This guy generally does interesting work, but he's used an LLM to analyze the trends in a "creation science" journal over time, and I just don't think LLMs are effective for this kind of statistical task.

myrmepropagandist wrote:

@Moss

Damn thing will sit there and tell you that's what it's doing.

But it can't count! It still can't count. I feel like I'm going crazy. Am I the only person who cares that the machine can't even count?

Dawn Ahukanna replied (#20):

@futurebird @Moss

“But it can't count! It still can't count. I feel like I'm going crazy. Am I the only person who cares that the machine can't even count?”

I also feel deep incredulity towards this corporate-grade “confabulation”.

myrmepropagandist wrote:

I mean LLMs are based on statistics, and they will produce results that look like frequency charts. But these charts only attempt to approximate the expected content. They aren't based on counting articles that meet any set of criteria.

It's... nonsense, and not even people who pride themselves on spotting nonsense seem to understand this.

Guest replied (#21):

@futurebird regrettably being that guy:

In the context of how LLM deep research workflows are built, I do think you might need to show your work on this claim more than OP does.

The model is not the only operative mechanism in such an investigation.

In that approach, the model would be invoking (deterministic) tools that, among other things, could log instances of topic areas encountered within a corpus. OP says they are capturing abstracts and authors and grouping them by year. Objectively, this category of work is something these tools can be built to do really well, including citations (to real, verifiable URLs). Statistical modeling tasks, including text analysis, can be offloaded to one-off scripts written and executed specifically for a requested job. Perhaps the model can’t tell you the “R” count in strawberry, but it can write Python, which does quite well.

Moreover, it is possible to objectively evaluate the performance of these tools for such tasks (and Anthropic, vendor of OP’s research tool, does this).

I mention all of this because I find this particular flavor of strawman quite pernicious: the limitations of the raw model architecture are entirely possible to mitigate through larger agent and tool scaffolding, and this work is constant, ongoing, and often quite effective. Critique of the technology and its vendors (essential) is meanwhile less effective when claims like this are so easily disproved by experience, usage, and public information.

Here’s a bit more detail on the architecture point:

How we built our multi-agent research system: on the engineering challenges and lessons learned from building Claude's Research system (www.anthropic.com)
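
For concreteness, here is a minimal sketch of the kind of deterministic, one-off counting script the post above describes an agent offloading work to. The CSV layout and topic keywords are hypothetical stand-ins, not OP's actual corpus or criteria.

```python
# Hypothetical sketch: the agent writes and runs a deterministic script rather
# than "counting" inside the model itself.
# Assumes a CSV with columns: year, title, abstract (made-up layout).
import csv
from collections import Counter, defaultdict

# Made-up topic keywords, for illustration only.
TOPICS = {
    "flood geology": ["flood", "geolog"],
    "radiometric dating": ["radiometric", "isochron", "half-life"],
}

def count_topics_by_year(path):
    counts = defaultdict(Counter)  # year -> Counter of topic hits
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            text = (row["title"] + " " + row["abstract"]).lower()
            for topic, keywords in TOPICS.items():
                if any(k in text for k in keywords):
                    counts[row["year"]][topic] += 1
    return counts

if __name__ == "__main__":
    # The "strawberry" point from the post, done deterministically:
    print("strawberry".count("r"))  # 3
    for year, topic_counts in sorted(count_topics_by_year("abstracts.csv").items()):
        print(year, dict(topic_counts))
```
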
myrmepropagandist replied to Guest (#22):

@danilo

Is that what the guy in the video is doing?

Guest replied to myrmepropagandist (#23):

@futurebird according to what he describes in the methods section of the video, he is doing an entirely plausible research task with a tool well suited to it, yes.

The post I linked describes how it works.

myrmepropagandist replied to Guest (#24):

@danilo

OK, but he's saying things about it counting articles (frequency), and when I used the same tool it could not do that accurately. It couldn't even follow a command to restrict the dataset. It does not sound like he used some kind of API to make this kind of task possible.

David Chisnall (*Now with 50% more sarcasm!*) replied to Dawn Ahukanna (#25):

@dahukanna @futurebird @Moss

It’s a shame that it lists summarisation as something LLMs are good at, when all of the studies that measure this show the opposite. LLMs are good at turning text into less text, but summarisation is the process of extracting the key points from text. LLMs will extract things that are shaped in the same way as a statistically large number of key points in the training set but they don’t understand either the text of the document or your context for requesting a summary and so are very likely to discard the thing that you think is most important. They also have a habit of inverting the meaning of sentences when shrinking them.

myrmepropagandist replied to David Chisnall (#26):

@david_chisnall @dahukanna @Moss

Why do I have to write the software guide for Google and Sora?

myrmepropagandist replied to David Chisnall (#27):

@david_chisnall @dahukanna @Moss

Likewise, the second question is what the guy in the video at the start of the post *thought* he was doing. But by introducing counting articles into the task, it became something else.

myrmepropagandist replied to David Chisnall (#28):

@david_chisnall @dahukanna @Moss

I'm not an AI prohibitionist or "hater"; however, I keep finding the effective use case is much, much, much more narrow than the UI we have been shown for using these tools would suggest.

And a lot of people really seem to find it "easier" than searching the web, which, given the current state of the web, isn't saying very much.

Has web search been broken to push everyone to the chatbots? (adjusting my tin foil cap here)

myrmepropagandist continued (#29):

@david_chisnall @dahukanna @Moss

Imagine inventing electricity and you just give people a live wire to play with.

"people are killing themselves"
"but look, some of them use the wire carefully to power cool and useful machines. why are you a hater?"
