🇮🇹 🇪🇪 🖥

  • 0 Posts
  • 131 Comments
Joined 9 months ago
cake
Cake day: March 19th, 2024

help-circle









  • Humans are notoriously worse at tasks that have to do with reviewing than they are at tasks that have to do with creating. Editing an article is more boring and painful than writing it. Understanding and debugging code is much harder than writing it etc., observing someone cooking to spot mistakes is more boring than cooking etc.

    This also fights with the attention required to perform those tasks, which means a higher ratio of reviewing vs creating tasks leads to lower quality output because attention is depleted at some point and mistakes slip in. All this with the additional “bonus” to have to pay for the tool AND the human reviewing while also wasting tons of water and energy. I think it’s wise to ask ourselves whether this makes sense at all.


  • I would argue that this makes the process microscopically more efficient and macroscopically way less efficient. That whole process probably is useless, and imagine wasting so much energy, water and computing power just to speed this useless process up and saving a handful of minutes (I am a lead and it takes me 2/3 minutes to put together a status of my team, and I don’t usually even request a status from each member).

    I keep saying this to everyone in my company who pushes for LLMs for administrative tasks: if you feel like LLMs can do this task, we should stop doing it at all because it means we are just going through the motions and pleasing a process without purpose. You will have people producing reports via LLM from a one-line prompt, the manager assembling it together with LLM and at vest someone reading it distilling it once again with LLMs. It is all a great waste of money, energy, time, cognitive effort that doesn’t benefit anybody.

    As soon as someone proposes to introduce LLMs in a process, raise with cutting that process altogether. Let’s produce less bullshit, instead of more while polluting even more in the process.


  • Just to precise, when I said bruteforce I didn’t imagine a bruteforce of the calculation, but a brute force of the code. LLMs don’t really calculate either way, but what I mean is more: generate code -> try to run and see if tests work -> if it doesn’t ask again/refine/etc. So essentially you are just asking code until what it spits out is correct (verifiable with tests you are given).

    But yeah, few years ago this was not possible and I guess it was not due to the training data. Now the problem is that there is not much data left for training, and someone (Bloomberg?) reported that training chatGPT 5 will cost billions of dollars, and it looks like we might be near the peak of what this technology could offer (without any major problem being solved by it to offset the economical and environmental cost).

    Just from today https://www.techspot.com/news/106068-openai-struggles-chatgpt-5-delays-rising-costs.html


  • I don’t see the change. Sure, there are spam websites with AI content that were not there before, but is this news business at all? All major publishers and newspapers don’t (seem to) use AI as far as I can tell.

    Also I would argue this is no much of a change except maybe in simplicity to generate fluff. All of this existed already for 20 years now, and it’s a byproduct of the online advertisement business (that for sure was a major change in society!). AI pieces are just yet another way to generate content in the hope of getting views.


  • Maybe some postmortem analysis will be interesting. The AoC is also a context in which the domain is self-contained and there is probably a ton of training material on similar problems and tasks. I can imagine LLM might do decently there.

    Also there is no big consequence if they don’t and it’s probably possible to bruteforce (which is how many programming tasks have been solved).


  • Agree to disagree.

    There is a lot that can be discussed in a philosophical debate. However, any 8 years old would be able to count how many letters are in a word. LLMs can’t reliably do that by virtue of how they work. This suggests me that it’s not just a model/training difference. Also evolution over million of years improved the “hardware” and the genetic material. Neither of this is compares to computing power or amount of data which is used to train LLMs.

    I believe a lot of this conversation stems from the marketing (calling “intelligence”) and the anthropomorphization of AI.

    Anyway, time will tell. Personally I think it’s possible to reach a general AI eventually, I simply don’t think the LLMs approach is the one leading there.


  • As much as I agree with you, humans can learn a bunch of stuff without first learning the content of the whole internet and without the computing power of a datacenter or consuming the energy of Belgium. Humans learn to count at an early age too, for example.

    I would say that the burden of proof is therefore reversed. Unless you demonstrate that this technology doesn’t have the natural and inherent limits that statistical text generators (or pixel) have, we can assume that our mind works differently.

    Also you say immature technology but this technology is not fundamentally (I.e. in terms of principle) different from what Weizenabum’s ELIZA in the '60s. We might have refined model and thrown a ton of data and computing power at it, but we are still talking of programs that use similar principles.

    So yeah, we don’t understand human intelligence but we can appreciate certain features that absolutely lack on GPTs, like a concept of truth that for humans is natural.



  • There is a bunch of research showing that model improvement is marginal compared to energy demand and/or amount of training data. OpenAI itself ~1 month ago mentioned that they are seeing a smaller improvements in Orion (I believe) vs GPT4 than there was between GPT 4 and 3. We are also running out of quality data to use for training.

    Essentially what I mean is that the big improvements we have seen in the past seem to be over, now improving a little cost a lot. Considering that the costs are exorbitant and the difference small enough, it’s not impossible to imagine that companies will eventually give up if they can’t monetize this stuff.



  • That is my experience, it’s generally quite decent for small and simple stuff (as I said, distillation of documentation). I use it for rust, where I am sure the training material was much smaller than other languages. It’s not a matter a prompting though, it’s not my prompt that makes it hallucinate functions that don’t exist in libraries or make it write code that doesn’t compile, it’s a feature of the technology itself.

    GPTs are statistical text generators after all, they don’t “understand” the problem.