A software developer and Linux nerd living in Germany. I’m usually a chill dude, but my online persona doesn’t always reflect my true personality. Take what I say with a grain of salt; I usually try to be nice and give good advice, though.

I’m into Free Software, self-hosting, microcontrollers and electronics, freedom, privacy and the usual stuff. And a few select other random things as well.

  • 0 Posts
  • 76 Comments
Joined 4 years ago
Cake day: August 21st, 2021

  • I’m not sure what youtube2peertube is. But PeerTube has a built-in feature these days to import and synchronize YouTube channels (it needs to be enabled by the instance admin). It somewhat works. But YouTube applies heavy rate limiting and also often blocks datacenter IP addresses and VPNs, so it might not work on a vServer, just on a residential internet connection.








  • Yes, thanks. Just invalidating or trimming the memory doesn’t cut it. OP wants it erased, so it needs to be one of the proper erase commands. I think blkdiscard also has flags for that (zero, secure), so I believe you could do it with that command as well, if it’s supported by the device and you append the correct options. Other commands are easier to use, though (if supported).
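    For reference, a rough sketch of what I mean, wrapped in Python since there’s no shell snippet to copy here (the device name is a placeholder, and --secure / --zeroout only do anything if the device supports them):

```python
import subprocess

# Placeholder device name -- replace with the actual SSD and run as root.
device = "/dev/sdX"

# --secure asks the device for a secure discard of the whole disk;
# --zeroout zero-fills instead of discarding. Use one or the other,
# and only if the device supports it.
subprocess.run(["blkdiscard", "--secure", device], check=True)
# subprocess.run(["blkdiscard", "--zeroout", device], check=True)
```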



  • Well, in fact it can. That’s “overprovisioning”. The SSD has some amount of reserved space as a replacement for bad cells, and maybe to speed things up. So if you overwrite 100% of what you have access to on the SSD, you’d still have X amount of data you didn’t catch. But loosely speaking you’re right. If you overwrite the entire SSD, and not just files or one partition or something like that, you force it to replace most of the content.
    I wouldn’t recommend it, though. There is secure erase, blkdiscard and some nvme format commands which do it the right way. And ‘dd’ is just a method that gets it about right (though not 100%) in one specific case.
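    To illustrate what “the right way” can look like, a hedged sketch (device names and the temporary password are placeholders; whether a drive supports NVMe secure format or ATA Secure Erase depends on the hardware, and every command here wipes the whole device):

```python
import subprocess

# Placeholder device names -- adjust, run as root, and note that each
# command below irreversibly destroys all data on its device.

# NVMe drive: "nvme format" with a secure-erase setting
# (--ses=1 erases user data, --ses=2 is a cryptographic erase where supported).
subprocess.run(["nvme", "format", "/dev/nvme0n1", "--ses=1"], check=True)

# SATA drive: ATA Secure Erase via hdparm
# (set a temporary security password, then issue the erase).
subprocess.run(["hdparm", "--user-master", "u",
                "--security-set-pass", "temppass", "/dev/sdX"], check=True)
subprocess.run(["hdparm", "--user-master", "u",
                "--security-erase", "temppass", "/dev/sdX"], check=True)
```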






  • hendrik@palaver.p3x.de to Linux@lemmy.ml · How to backup around 200 DVD (edited · 19 days ago)

    I think the best bet to preserve them as-is would be dd or ddrescue (if there are read errors). You might be able to write a small shell script to automate it: for example, open the tray, read a filename from the user, then close the tray, rip the disc and repeat. That way you notice the open tray, change discs, enter the title, hit enter and come back 10 minutes later. Obviously that takes something like 20 days if you do 10 each day. And you’re looking at roughly 1 TB of storage if they’re single-layer DVDs.
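    Roughly along these lines, sketched in Python rather than shell (the drive path, output directory and the plain ddrescue invocation are assumptions to adapt):

```python
import os
import subprocess

# Rough sketch of the rip loop described above, in Python instead of shell.
# Drive path and output directory are assumptions -- adjust for your setup.
DRIVE = "/dev/sr0"
OUTDIR = "dvd-rips"
os.makedirs(OUTDIR, exist_ok=True)

while True:
    subprocess.run(["eject", DRIVE])              # open the tray
    title = input("Disc title (empty to stop): ").strip()
    if not title:
        break
    subprocess.run(["eject", "-t", DRIVE])        # close the tray
    image = os.path.join(OUTDIR, f"{title}.iso")
    mapfile = os.path.join(OUTDIR, f"{title}.map")
    # ddrescue keeps going past read errors and records progress in the map file
    subprocess.run(["ddrescue", DRIVE, image, mapfile], check=True)
```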


  • Is an SSD’s cache even about wear? I mean, wear only happens on write operations, and I would expect an SSD to apply the writes as fast as possible, since piling up work (a filled write cache) means additional latency and less performance on the next, larger write operation. Along with a few minor issues like possible data loss on (power) failure.
    And on read, a cache on the wrong side of the bottleneck doesn’t do that much. An SSD has pretty much random access to all the memory; it’s not like it has to wait for a mechanical head to move into position on the platter for data to become available?!

    But I haven’t looked this up. I might be wrong. What I usually do is make sure a computer has enough RAM and that it is used properly. That will also cache data and avoid unnecessary transfers. And RAM is orders of magnitude faster; you can get gigabytes’ worth of it for a few tens of dollars… Though adding RAM might not be easily done on the more recent ThinkPads… I’ve noticed they’ve come with a maximum of one RAM slot for some years already, sometimes none, and then it’s soldered.


  • I think a SATA connection might be the bottleneck, with its maximum throughput of 600 MB/s. So for that use case you don’t need to worry about the SSD’s speed and cache; it won’t be able to perform at full speed behind the SATA link anyway. But I don’t know how exactly you plan to repurpose it later. Maybe skip the adapter if it’s expensive, buy a cheap SATA SSD now and a new, fast PCIe one in a few years once you get a new computer.
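    For reference, the 600 MB/s figure follows from SATA III’s 6 Gbit/s line rate minus the 8b/10b encoding overhead:

    $$ 6\,\text{Gbit/s} \times \tfrac{8}{10} = 4.8\,\text{Gbit/s} = 600\,\text{MB/s} $$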


  • Yeah, sure. No offense. I mean, humans differ as well. I have friends who, when they talk about a subject, have read some article about it and will tell me a lot of facts, and I rarely see them make any mistakes at all or confuse things. And then I have friends who like to talk a lot, and I’d better check where they picked that up.
    I think I’m somewhere in the middle. I definitely make mistakes. But sometimes my brain manages to store where I picked something up and whether that was speculation, opinion or fact, along with the information itself. I’ve had professors who would quote information verbatim and tell you roughly where, and in which book, to find it.

    With AI I’m currently very cautious. I’ve seen lots of confabulated summaries and made-up facts, and if it’s designed to, it’ll write them in a professional tone. I’m neither opposed to AI nor a big fan of some of its applications. I just think it’s still very far away from what I’ve seen some humans are able to do.


  • I think the difference is that humans are sometimes aware of it. A human will likely say “I don’t know what Kanye West did in 2018”, while the AI is very likely to make something up, and in contrast to a human it will likely be phrased like a Wikipedia article. You can often look a human in the eyes and know whether they’re telling the truth, lying, or are uncertain. Not always, and we also tell untrue things, but I think the hallucinations are kind of different in several ways.


  • I’m not a machine learning expert at all. But I’d say we’re not set on the transformer architecture. Maybe just invent a different architecture which isn’t subject to that? Or maybe specifically factor this in. Isn’t the way we currently train LLM base models just to feed in all the text we can get? From Wikipedia and research papers to all the fictional books from Anna’s Archive and weird Reddit and internet talk? I wouldn’t be surprised if they make things up, since we train them on factual information, fiction and creative writing without any distinction… Maybe we should add something to the architecture to make it aware of the factuality of text and guide this… Or: I skimmed some papers a year or so ago where they had a look at the activations. Maybe do some more research into which parts of an LLM are concerned with “creativity” or “factuality” and expose that to the user. Or study how hallucinations work internally and then try to isolate this so it can be handled accordingly?