Breakthrough Technique: Meta-learning for Compositionality

Original:
https://www.nature.com/articles/s41586-023-06668-3

Popularization:
https://scitechdaily.com/the-future-of-machine-learning-a-new-breakthrough-technique/

How MLC Works
In exploring the possibility of bolstering compositional learning in neural networks, the researchers created MLC, a novel learning procedure in which a neural network is continuously updated to improve its skills over a series of episodes. In an episode, MLC receives a new word and is asked to use it compositionally—for instance, to take the word “jump” and then create new word combinations, such as “jump twice” or “jump around right twice.” MLC then receives a new episode that features a different word, and so on, each time improving the network’s compositional skills.
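
To make the episode idea above concrete, here is a minimal sketch of how such episodes might be sampled. It is an illustration only, not the authors' code: the word list, the action symbols, the "twice"/"thrice" modifiers and the names `interpret` and `make_episode` are all assumptions, and no network is trained here.

```python
import random

# All names below are hypothetical and chosen for illustration only.
WORDS = ["jump", "walk", "run", "look"]      # primitive words
ACTIONS = ["JUMP", "WALK", "RUN", "LOOK"]    # output symbols
MODIFIERS = {"twice": 2, "thrice": 3}        # repetition modifiers


def interpret(command, lexicon):
    """Compositionally map a command such as 'jump twice' to a list of actions."""
    tokens = command.split()
    action = lexicon[tokens[0]]
    repeat = MODIFIERS[tokens[1]] if len(tokens) > 1 else 1
    return [action] * repeat


def make_episode(rng):
    """Sample one episode: a fresh word-to-action mapping plus study and query examples."""
    shuffled = ACTIONS[:]
    rng.shuffle(shuffled)
    lexicon = dict(zip(WORDS, shuffled))     # word meanings change in every episode
    new_word = rng.choice(WORDS)             # the word to be used compositionally
    # Study examples: every word in isolation, plus modifiers shown on another word.
    study = [(w, interpret(w, lexicon)) for w in WORDS]
    other = rng.choice([w for w in WORDS if w != new_word])
    study += [(f"{other} {m}", interpret(f"{other} {m}", lexicon)) for m in MODIFIERS]
    # Query examples: the new word with modifiers it was never shown with.
    queries = [(f"{new_word} {m}", interpret(f"{new_word} {m}", lexicon)) for m in MODIFIERS]
    return study, queries


if __name__ == "__main__":
    rng = random.Random(0)
    for i in range(3):
        study, queries = make_episode(rng)
        print(f"episode {i}: study={study} queries={queries}")
```

In a training setup along these lines, a sequence model would presumably be shown the study examples together with a query command and be optimized to produce the query's action sequence, with a new episode sampled at every step.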

  • A_A@lemmy.worldOP
    1 year ago

    Edit : Please read @DigitalMus@feddit.dk’s comment before mine.


    Hey folks, I believe this is really big.

    Traditional deep neural network training requires millions of examples and so, despite its great success, is immensely inefficient.

    Now what if these machines learned as fast as a human, or faster? Well, it seems this is it.

    Look at how disruptive large language models are for many sectors of society. This new technology could accelerate the process exponentially.

    • TropicalDingdong@lemmy.world
      1 year ago

      Traditional deep neural network training requires millions of examples and so, despite its great success, is immensely inefficient.

      Is this a limited advancement in training techniques? Right now I’m working on several types of image classification models. How would this be able to help me?

          • A_A@lemmy.worldOP
            1 year ago

            I am not sure what “image classification models” encompasses. I would have to read more to understand, and I don’t have enough time and energy.
            Yet in the past I have read and understood a few books about neural networks, and this new article in Nature is something else: it’s clear when reading it.
            ( also to @TropicalDingdong@lemmy.world )

            • TropicalDingdong@lemmy.world
              1 year ago

              I mean, is this any different from standard gradient descent with something like Adam as the optimiser?

              That’s my assumption based on the headline. But the quick skim I gave the article seemed to only discuss it in the context of NLP. Not exactly my field of study.

          • QueriesQueried@sh.itjust.works
            1 year ago

            Admittedly, they were quoting someone else in the message you responded to. That may have been edited after the fact, but the person they’re quoting did in fact say those words (“this is big”).

            It was I who couldn’t read, as that is not what happened.

    • Stantana@lemmy.sambands.net
      1 year ago

      It’s over for us useless eaters. No matter how useful one is, we will always be useless compared to what’s coming.

      • A_A@lemmy.worldOP
        1 year ago

        There is still some hope; maybe the machines will have more compassion than humans do, or maybe we are inside the matrix already as useful parts.
        There are so many unknowns in the future and our insights are so limited.

    • ExLisper@linux.community
      1 year ago

      Now what if these machines learned as fast as a human, or faster?

      What do you mean? It’s already faster than a human’s. It takes years for a person to learn basic language and decades to gain expert knowledge in any field.

      • A_A@lemmy.worldOP
        1 year ago

        What is meant here (and said as such in the article) is that humans can learn from a single example, while deep neural networks take thousands or millions of examples to learn.

        • ExLisper@linux.community
          1 year ago

          Ok, but neural networks can process way more examples per second so ‘faster’ is not really the right term here.

          • A_A@lemmy.worldOP
            1 year ago

            Yes you are right. And I was hoping for someone more knowledgeable to help clarify this topic.

            Well I was lucky with the comment of @DigitalMus in here, if you would like to read it.

  • DigitalMus@feddit.dk
    1 year ago

    While I’m not in the field either, I do know that it is quite unusual in computer science academia to publish in actual peer-reviewed journals. This is because it can be a long process, and the field is very fast moving, so your results would be outdated by the time you publish. Thus, a paper is typically synonymous with a conference proceeding, and can be found on arXiv. I found this paper on arXiv from 2017/2018, which seems to be when this work was originally published for the scientific community and presented at a very “good” (if I had to guess) conference. Google Scholar says this paper has 650 citations, so it probably has had quite some impact. However, if it were truly disruptive, I would guess this method would already be well known and implemented in many models.

    • KingRandomGuy@lemmy.world
      1 year ago

      For reference, ICML is one of the most prestigious machine learning conferences alongside ICLR and NeurIPS.

    • A_A@lemmy.worldOP
      1 year ago

      Good to know, Thanks.
      Yours is the type of comment I was really hoping to read here.

      You are right: it’s the same authors (Brenden M. Lake & Marco Baroni) with mostly the same content.

      But they also write (in Nature) that modern systems (GPT-4) do not yet incorporate these abilities:

      Preliminary experiments reported in Supplementary Information 3 suggest that systematicity is still a challenge, or at the very least an open question, even for recent large language models such as GPT-4.

      • DigitalMus@feddit.dk
        1 year ago

        This certainly could be part of the motivation for publishing it this way, to make themselves more noticed by the big players. Btw, publishing open access in Nature is expensive; it’s something like 6,000-8,000 euros for the big journals, so there definitely is a reason.

    • Chobbes@lemmy.world
      1 year ago

      To be clear, the papers at conferences undergo a peer review process as well. There are journal publications in CS, but a lot of publishing is done through conferences. arXiv, while a great resource, has little to do with the conferences, and it is worth noting that the papers on arXiv do not go through a peer review process. They are, however, often also published at conferences where the paper has undergone peer review, and some papers on arXiv may be preprint versions from before the peer review process.