I’m a tech interested guy. I’ve touched SQL once or twice, but wasn’t able to really make sense of it. That combined with not having a practical use leaves SQL as largely a black box in my mind (though I am somewhat familiar with technical concepts in databasing).

With that, I keep seeing [pic related] as proof that Elon Musk doesn’t understand SQL.

Can someone give me a technical explanation for how one would come to that conclusion? I’d love if you could pass technical documentation for that.

  • DahGangalang@infosec.pubOP
    link
    fedilink
    arrow-up
    25
    ·
    8 months ago

    A given SSN appearing in multiple tables actually makes sense. To someone not familiar with SQL (i.e. at about my level of understanding), I could see that being misinterpreted as having multiple SSN repeated “in the database”.

    Of all the comments ao far, I find yours the most compelling.

    • Barbarian@sh.itjust.works
      link
      fedilink
      arrow-up
      14
      ·
      edit-2
      8 months ago

      Theoretically, yeah, that’s one solution. The more reasonable thing to do would be to use the foreign key though. So, for example:

      SSN_Table

      ID | SSN | Other info

      Other_Table

      ID | SSN_ID | Other info

      When you want to connect them to have both sets of info, it’d be the following:

      SELECT * FROM SSN_Table JOIN Other_Table ON SSN_Table.ID = Other_Table.SSN_ID

      EDIT: Oh, just to clear up any confusion, the SSN_ID in this simple example is not the SSN itself. To access that in this example query, it’d by SSN_Table.SSN

      • schteph@lemmy.world
        link
        fedilink
        arrow-up
        21
        ·
        8 months ago

        This is true, but there are many instances where denormalization makes sense and is frequently used.

        A common example is a table that is frequently read. Instead of going to the “central” table the data is denormalized for faster access. This is completely standard practice for every large system.

        There’s nothing inherently wrong with it, but it can be easily misused. With SSN, I’d think the most stupid thing to do is to use it as the primary key. The second one would be to ignore the security risks that are ingrained in an SSN. The federal government, being large as it is, I’m sure has instances of both, however since Musky is using his possy of young, arrogant brogrammers, I’m positively certain they’re completely ignoring the security aspect.

        • esa@discuss.tchncs.de
          link
          fedilink
          arrow-up
          8
          ·
          8 months ago

          To be a bit more generic here, when you’re at government scale you’re generally deep in trade-off territory. Time and space are frequently opposed values and you have to choose which one is most important, and consider the expenses of both.

          E.g. caching is duplicating data to save time. Without it we’d have lower storage costs, but longer wait times and more network traffic.

        • DahGangalang@infosec.pubOP
          link
          fedilink
          arrow-up
          7
          ·
          8 months ago

          Yeah, no one appreciates security.

          I probably overused that saying to explain it: ‘if theres no break ins, why do we pay for security? Oh, there was a break in - what do we even pay security for?’

        • Barbarian@sh.itjust.works
          link
          fedilink
          arrow-up
          3
          ·
          8 months ago

          Yeah, I work daily with a database with a very important non-ID field that is denormalized throughout most of the database. It’s not a common design pattern, but it is done from time to time.

      • DahGangalang@infosec.pubOP
        link
        fedilink
        arrow-up
        2
        ·
        8 months ago

        Yeah, databases are complicated and make my head hurt. Glancing through resources from other comments, I’m realizing I know next to nothing about database optimization. Like, my gut reaction to your comment is that it seems like unnecessary overhead to have that data across two tables - but if one sub-dept didn’t need access to the raw SSN, but did need access to less personal data, j could see those stored in separate tables.

        But anyway, you’re helping clear things up for me. I really appreciate the pseudo code level example.

        • Barbarian@sh.itjust.works
          link
          fedilink
          arrow-up
          6
          ·
          8 months ago

          It’s necessary to split it out into different tables if you have a one-to-many relationship. Let’s say you have a list of driver licenses the person has had over the years, for example. Then you’d need the second table. So something like this:

          SSN_Table

          ID | SSN | Other info

          Driver_License_Table

          ID | SSN_ID | Issue_Date | Expiry_Date | Other_Info

          Then you could do something like pull up a person’s latest driver’s license, or list all the ones they had, or pull up the SSN associated with that license.

          • Arcka@midwest.social
            link
            fedilink
            English
            arrow-up
            2
            ·
            8 months ago

            I think a likely scenario would be for name changes, such as taking your partner’s surname after marriage.