• 0 Posts
  • 35 Comments
Joined 1 year ago
Cake day: July 5th, 2023

  • There is an episode of Tech Won’t Save Us (2024-01-25) discussing how weird the podcasting play was for Spotify. There is essentially no way to monetize podcasts at scale, primarily because podcasts do not have the same degree of platform lock-in as other media types.

    Spotify spent the $100 million (or whatever the number was) to get Rogan exclusive, but for essentially every other podcast you can find a free RSS feed with skippable ads. Also their podcast player just outright sucks :/


  • Errrrm… No. Don’t get your philosophy from LessWrong.

    Here’s the part of the LessWrong page that cites Simulacra and Simulation:

    Like “agent”, “simulation” is a generic term referring to a deep and inevitable idea: that what we think of as the real can be run virtually on machines, “produced from miniaturized units, from matrices, memory banks and command models - and with these it can be reproduced an indefinite number of times.”

    This last quote does indeed come from Simulacra (you can find it in the third paragraph here), but it appears to have been quoted solely because when paired with the definition of simulation put forward by the article:

    A simulation is the imitation of the operation of a real-world process or system over time.

    it appears that Baudrillard supports the idea that a computer can just simulate any goddamn thing we want it to.

    If you are familiar with the actual arguments Baudrillard makes, or simply read the context around that quote, it is obvious that this is misappropriating the text.


  • The reason the article compares to commercial flights is that your everyday reader knows planes’ emissions are large. It’s a reference point so people can weigh the ecological tradeoff.

    “I can emit this much by either (1) operating the global airline network, or (2) running cloud/LLMs.” It’s a good way to visualize the cost of cloud systems without just citing tons-of-CO2/yr.

    Downplaying that by insisting we look at the transportation industry as a whole doesn’t strike you as… a little silly? We know transport is expensive; it is moving tons of mass over hundreds of miles. The fact that computer systems even get close is an indication of the sheer scale of energy being poured into them.



  • Spedwell@lemmy.world to Technology@lemmy.world · Mapping the Mind of a Large Language Model
    6 months ago

    concepts embedded in them

    internal model

    You used both phrases in this thread, but those are two very different things. It’s a stretch to say this research supports the latter.

    Yes, LLMs are still next-token generators. That is a descriptive statement about how they operate. They just have embedded knowledge that allows them to generate sometimes meaningful text.
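
    To make “next-token generator” concrete, here is a toy sketch of the generation loop. Everything in it is made up for illustration: `BIGRAMS` is a hypothetical hand-written lookup table standing in for the model, and `generate` is a hypothetical helper. A real LLM replaces the table with a learned network conditioned on the whole context, but the outer loop (pick a next token, append it, repeat) has the same shape.

```python
import random

# Hypothetical toy "model": maps the previous token to weighted next-token candidates.
# A real LLM computes these weights with a neural network over the full context.
BIGRAMS = {
    "<s>": {"the": 1.0},
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 1.0},
    "dog": {"sat": 1.0},
    "sat": {"</s>": 1.0},
}


def generate(rng: random.Random, max_tokens: int = 10) -> list[str]:
    """Sample one token at a time until the end marker or the length limit."""
    tokens = ["<s>"]
    for _ in range(max_tokens):
        candidates = BIGRAMS[tokens[-1]]
        next_token = rng.choices(
            list(candidates), weights=list(candidates.values())
        )[0]
        if next_token == "</s>":
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the start marker


if __name__ == "__main__":
    print(" ".join(generate(random.Random(0))))
```

    Whatever “knowledge” this toy has lives entirely in the table’s weights, which is the sense of “embedded” meant here.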




  • The issue on the copyright front is the same kind of professional standards and professional ethics that should stop you from just outright copying open-source code into your application. It may be very small portions of code, and you may never get caught, but you simply don’t do that. If you wouldn’t steal a function from a copyleft open-source project, you wouldn’t use that function when copilot suggests it. Idk if copilot has added license tracing yet (been a while since I used it), but absent that feature you are entirely blind to the extent to which its output is infringing on licenses. That’s a huge legal liability to your employer, and an ethical coinflip.


    Regarding understanding of code, you’re right. You have to own what you submit into the codebase.

    The drawbacks/risks of using LLMs or copilot have more to do with the fact that it generates the likely code, which means it’s statistically biased to reproduce whatever common, hard-to-notice buggy logic exists in the average github repo it trained on. At some point it will give you code that you read and say “yep, looks right to me”, and that code actually has a subtle buffer overflow issue, or actually fails in an edge case, in a way that is just unnoticeable enough.

    And you can make the argument that it’s your responsibility to find that (it is). But I’ve seen some examples thrown around on twitter of just slightly bugged loops; I’ve seen examples of it replicating known vulnerabilities; and we have that package name fiasco in that first article above.

    If I ask myself “would I definitely have caught that?” the answer is only a maybe. If it replicates a vulnerability that existed in open-source code for years before it was noticed, do you really trust yourself to identify that the moment copilot suggests it to you?

    I guess it all depends on stakes too. If you’re generating buggy JavaScript, who cares.
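
    For a sense of what a “yep, looks right to me” bug can look like, here is a small made-up example (not taken from any actual copilot output; both function names are hypothetical). The first version reads fine at a glance and passes the obvious test, but fails on an edge case.

```python
def page_count_buggy(total_items: int, page_size: int) -> int:
    # Looks plausible, but floor division silently drops a final partial page.
    return total_items // page_size


def page_count(total_items: int, page_size: int) -> int:
    # Ceiling division: a partially filled last page still counts as a page.
    return -(-total_items // page_size)


if __name__ == "__main__":
    # The "obvious" check passes for both versions...
    print(page_count_buggy(100, 10), page_count(100, 10))
    # ...and only an input that is not a multiple of the page size shows the gap.
    print(page_count_buggy(101, 10), page_count(101, 10))
```

    If your review only tries round numbers, the buggy version sails through, which is the whole problem.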