@Spedwell - K-Money's Lemmy

Spedwell@lemmy.world · 3 months ago

If we’re doing short stories, I have two recommendations:

Ted Chiang’s Stories of Your Life and Others.
Kurt Vonnegut’s Welcome to the Monkey House.

Spedwell@lemmy.world · 4 months ago

The script doesn’t go away when you replace a helpdesk operator with ChatGPT. You just get a script-reading interface without empathy and a severely hindered ability to process novel issues outside its protocol.

The humans you speak to could do exactly what you’re asking for, if the business did not handcuff them to a script.

Spedwell@lemmy.world · 5 months ago

As the article points out, TSA is using this tech to improve efficiency. Every request for manual verification breaks their flow, requires an agent to come address you, and eats more time. At the very least, you ought to decline the scan, in the hopes that TSA metrics look poor enough that they decide this tech isn’t practical to use.

Spedwell@lemmy.world · 6 months ago

I’m curious what issue you see with that? It seems like the project is only accepting unrestricted donations, but is there something suspicious about Shopify that makes its involvement concerning (I don’t know much about them)?

Spedwell@lemmy.world · 6 months ago

Right concept, except you’re off in scale. A MULT instruction would exist in both RISC and CISC processors.

The big difference is that CISC tries to provide instructions to perform much more sophisticated subroutines. This video is a fun look at some of the most absurd ones, to give you an idea.
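To make the scale difference concrete, here is a toy sketch in Python (not real assembly, and not taken from the video). It compares a single CISC-style "copy this whole block" instruction, in the spirit of x86 `REP MOVSB`, with the same copy built from simple load/store/add/branch steps. The function names and the instruction counting are my own simplifications.

```python
def cisc_block_copy(mem, src, dst, count):
    """One 'instruction' that copies count bytes (in the spirit of x86 REP MOVSB)."""
    for i in range(count):
        mem[dst + i] = mem[src + i]
    return 1  # the program issued a single instruction


def risc_block_copy(mem, src, dst, count):
    """The same copy expressed as simple steps; returns how many were issued."""
    issued = 0
    while count > 0:
        reg = mem[src]   # load one byte into a register
        mem[dst] = reg   # store it at the destination
        src += 1         # advance source pointer
        dst += 1         # advance destination pointer
        count -= 1       # decrement remaining count
        issued += 6      # the five steps above plus the branch back to the top
    return issued


cisc_mem = bytearray(b"abcd____")
risc_mem = bytearray(b"abcd____")
print(cisc_block_copy(cisc_mem, 0, 4, 4), cisc_mem)
print(risc_block_copy(risc_mem, 0, 4, 4), risc_mem)
```

Both versions leave memory in the same state. The difference is how much work is packed into one instruction from the program's point of view.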

Spedwell@lemmy.world · 7 months ago

The current assumption made by these companies is that AI training is fair use, and is therefore legal regardless of license. There are still many ongoing court cases over this, but one case was already resolved in favor of the fair use position.

Spedwell@lemmy.world · edit-2 7 months ago

There is an episode of Tech Won’t Save Us (2024-01-25) discussing how weird the podcasting play was for Spotify. There is essentially no way to monetize podcasts at scale, primarily because podcasts do not have the same degree of platform lock-in as other media types.

Spotify spent the $100 million (or whatever the number was) to get Rogan exclusive, but for essentially every other podcast you can find a free RSS feed with skippable ads. Also their podcast player just outright sucks :/

Spedwell@lemmy.world · 7 months ago

Errrrm… No. Don’t get your philosophy from LessWrong.

Here’s the part of the LessWrong page that cites Simulacra and Simulation:

Like “agent”, “simulation” is a generic term referring to a deep and inevitable idea: that what we think of as the real can be run virtually on machines, “produced from miniaturized units, from matrices, memory banks and command models - and with these it can be reproduced an indefinite number of times.”

This last quote does indeed come from Simulacra (you can find it in the third paragraph here), but it appears to have been quoted solely because when paired with the definition of simulation put forward by the article:

A simulation is the imitation of the operation of a real-world process or system over time.

it appears that Baudrillard supports the idea that a computer can just simulate any goddamn thing we want it to.

If you are familiar with the actual arguments Baudrillard makes, or simply read the context around that quote, it is obvious that this is misappropriating the text.

Spedwell@lemmy.world · edit-2 7 months ago

The reason the article compares to commercial flights is that your everyday reader knows planes’ emissions are large. It’s a reference point so people can weigh the ecological tradeoff.

“I can emit this much by either (1) operating the global airline network, or (2) running cloud/LLMs.” It’s a good way to visualize the cost of cloud systems without just citing tons-of-CO2/yr.

Downplaying that by insisting we look at the transportation industry as a whole doesn’t strike you as… a little silly? We know transport is expensive; it is moving tons of mass over hundreds of miles. The fact that computer systems even get close is an indication of the sheer scale of energy being poured into them.

Spedwell@lemmy.world · edit-2 7 months ago

concepts embedded in them

internal model

You used both phrases in this thread, but those are two very different things. It’s a stretch to say this research supports the latter.

Yes, LLMs are still next-token generators. That is a descriptive statement about how they operate. They just have embedded knowledge that allows them to generate sometimes meaningful text.

Spedwell@lemmy.world · 8 months ago

It’s not really stupid at all. See the matrix code example from this article: https://spectrum.ieee.org/ai-code-generation-ownership

You can’t really know when the genAI is synthesizing from thousands of inputs or just outright reciting copyrighted code. Not kosher if it’s the latter.

Spedwell@lemmy.world · 8 months ago

I get that there are better choices now, but let’s not pretend like a straw you blow into is the technological stopping point for limb-free computer control (sorry if that’s not actually the best option, it’s just the one I’m familiar with). There are plenty of things to trash talk Neuralink about without pretending this technology (or its future form) is meritless.

Spedwell@lemmy.world · 8 months ago

The issue on the copyright front is the same kind of professional standards and professional ethics that should stop you from just outright copying open-source code into your application. It may be very small portions of code, and you may never get caught, but you simply don’t do that. If you wouldn’t steal a function from a copyleft open-source project, you wouldn’t use that function when copilot suggests it. Idk if copilot has added license tracing yet (been a while since I used it), but absent that feature you are entirely blind to the extent to which its output is infringing on licenses. That’s a huge legal liability to your employer, and an ethical coinflip.

Regarding understanding of code, you’re right. You have to own what you submit into the codebase.

The drawbacks/risks of using LLMs or copilot have more to do with the fact that it generates the likely code, which means it’s statistically biased to generate whatever common and unnoticeable bugged logic exists in the average github repo it trained on. It will at some point give you code that you read and say “yep, looks right to me”, and that actually has a subtle buffer overflow issue or fails in an edge case, in a way that is just unnoticeable enough.
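As an illustration of the "looks right to me" problem, here is a small Python sketch of my own (not actual copilot output). The helper names are hypothetical. The first version passes a casual read and works for every `n >= 1`, but quietly does the wrong thing when `n == 0`.

```python
def last_n_buggy(items, n):
    # Looks right: "give me the last n items".
    # But -0 == 0, so n == 0 slices from the start and returns everything.
    return items[-n:]


def last_n_fixed(items, n):
    # Compute the start index explicitly so n == 0 yields an empty list.
    return items[max(len(items) - n, 0):]


data = [1, 2, 3]
print(last_n_buggy(data, 2), last_n_fixed(data, 2))  # both agree here
print(last_n_buggy(data, 0), last_n_fixed(data, 0))  # they disagree here
```

Nothing in the buggy version stands out in review, and a test suite that never tries `n == 0` would pass it.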

And you can make the argument that it’s your responsibility to find that (it is). But I’ve seen some examples thrown around on twitter of just slightly bugged loops; I’ve seen examples of it replicating known vulnerabilities; and we have that package name fiasco in that first article above.

If I ask myself “would I definitely have caught that?”, the answer is only a maybe. If it replicates a vulnerability that existed in open-source code for years before it was noticed, do you really trust yourself to identify that the moment copilot suggests it to you?

I guess it all depends on stakes too. If you’re generating buggy JavaScript who cares.

Spedwell@lemmy.world · 8 months ago

We should already be at that point. We have already seen LLMs’ potential to inadvertently backdoor your code and to inadvertently help you violate copyright law (I guess we do need to wait to see what the courts rule, but I’ll be rooting for the open-source authors).

If you use LLMs in your professional work, you’re crazy. I would never be comfortable opening myself up to the legal and security liabilities of AI tools.

Spedwell@lemmy.world · 8 months ago

That’s significantly worse privacy-wise, since Google gets a copy of everything.

A recovery email in this case was used to uncover the identity of the account-holder. Unless you’re using proton mail anonymously (if you’re replacing your personal gmail, then probably not), you don’t need to consider the recovery email a weakness.

Spedwell@lemmy.world · 8 months ago

I think it’s more the dual-use nature of defense technology. It is very realistic to assume the tech that defends you here is also going to be used in armed conflict (which, historically for the US, involves many civilian deaths). To present the technology without that critical examination, especially to a young audience like Rober’s, is irresponsible. It can help form the view that this technology is inherently good, by leaving the adverse consequences under-examined and out of view to children watching this video.

Not that we need to suddenly start exposing kids to reporting on civilian collateral damage, wedding bombings, war crimes, etc… But if those are inherently part of this technology then leaving them out overlooks a crucial outcome of developing these tools. Maybe we just shouldn’t advertise defense tech in kids’ media?

Spedwell@lemmy.world · 9 months ago

I don’t believe that explanation is more probable. If the NSA had the power to compel Apple to place a backdoor in their chip, it would probably be a proper backdoor. It wouldn’t be a side channel in the cache that is exploitable only in specific conditions.

The exploit page mentions that the Intel DMP is robust because it is more selective. So this is likely just a simple design error of making the system a little too trigger-happy.

Spedwell@lemmy.world · edit-2 9 months ago

Wow, what a dishearteningly predictable attack.

I have studied computer architecture and hardware security at the graduate level—though I am far from an expert. That said, any student in the classroom could have laid out the theoretical weaknesses in a “data memory-dependent prefetcher”.
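To show the kind of weakness I mean, here is a deliberately oversimplified toy model in Python. It is not the actual Apple design or the published attack. The class, the address range, and the "is it cached" check (a stand-in for a timing measurement) are all assumptions made for illustration. The point is only that once the prefetcher acts on the value of loaded data, the cache state reveals something about that data.

```python
class ToyDMP:
    """Toy memory with a prefetcher that dereferences pointer-like data."""

    def __init__(self, memory, pointer_range):
        self.memory = memory                # address -> stored value
        self.pointer_range = pointer_range  # values treated as "looks like a pointer"
        self.cache = set()                  # addresses currently cached

    def load(self, addr):
        value = self.memory[addr]
        self.cache.add(addr)
        # The data-dependent step: the loaded *value* decides what is prefetched.
        if value in self.pointer_range:
            self.cache.add(value)
        return value

    def is_cached(self, addr):
        # Stand-in for an attacker timing an access (fast means cached).
        return addr in self.cache


def probe_hit(secret, probe=0x2000):
    """Victim loads a secret; attacker then checks one probe address."""
    dmp = ToyDMP({0x100: secret}, range(0x1000, 0x3000))
    dmp.load(0x100)
    return dmp.is_cached(probe)


print(probe_hit(0x2000))       # secret equals the probe address
print(probe_hit(0x12345678))   # secret does not look like a pointer
```

The attacker never reads the secret directly, yet the two cases are distinguishable from cache state alone. That is the information leak.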

My gut says (based on my own experience having a conversation like this) the engineers knew there was an “information leak” but management did not take it seriously. It’s hard to convince someone without a cryptographic background why you need to {redesign/add a workaround/use a lower performance design} because of “leaks”. If you can’t demonstrate an attack they will assume the issue isn’t exploitable.

Spedwell@lemmy.world · 11 months ago

What even is federation in the context of a distributed vcs like Git? Does it mean federation of the typical dev ops tools (issues, PRs, etc.)?

Spedwell@lemmy.world · 11 months ago

I have a thing for experimental CAD and modeling software, but hadn’t heard of PicoCAD! I’ll have to try it out, thanks for sharing.

Some other cool ones:

OpenSCAD
Dust3D
Antimony
LibFive (same dev as Antimony)