Items tagged with: ai
To begin with, I wonder what happens if our sites and profiles display CC-BY-SA-NC as a #copyright notice. Any use by #AI scrapers should then become illegal and compensation enforceable.
Also, if you search for #robotsTXT on Google, this is what you get.
> Ignoring robots.txt instructions can result in your scraping activities being considered unethical or even illegal.
@maxschrems
@markus_netzpolitik
@ankedb
All this said, being part of a decentralized web, as pointed out in this toot, our publicly visible interactions land on other instances and servers of the #fediVerse and can be scraped there. I wonder whether this situation actually might lead, or should lead, to a federation of servers that share the same robots.txt "ideals".
As @Matthias pointed out in his short investigation of the AI matter, this has (in my eyes) already reached unimagined levels of criminal and, without any doubt, unethical behavior, not to mention the range of options rogue actors have at hand.
It's evident why, for example, the elongated one immediately closed down access to X's public tweets, and I guess other companies did the same for the same reasons. Obviously the very first reason was to protect their advantage from the hoarded data sets used to train their AI in the first place. Yet, considering the latest behavior of the new owner of #twitter, nothing less than the creation of #AI-driven lists of "political" enemies, and not only from the data collected on his platform, is to be expected. An international political nightmare of epic proportions. Enough material for dystopian books and articles by people like @Cory Doctorow, @Mike Masnick ✅, @Eva Wolfangel, @Taylor Lorenz, @Jeff Jarvis, @Elena Matera, @Gustavo Antúnez 🇺🇾🇦🇷, to mention a few of the #journalism community, more than one #podcast episode by @Tim Pritlove and @linuzifer, or some lifetime legal cases for @Max Schrems.
What we are facing now is the fact that we need to protect our own and our users' data and privacy because of the advanced capabilities of #LLM. We are basically forced to consider switching to private/restricted posts and closing down our servers, as not only are the legal jurisdictions far too scattered across the different countries and ICANN details, but legislation and comprehension by the legislators is simply non-existent, as @Anke Domscheit-Berg could probably confirm.
That is to say, it looks like we need to go dark, a fact that will drive us even further into disappearing, as people will have less chance to see what we are all about, further advancing the advantages of the already established players in the social web space.
Just as Prof. Dr. Peter Kruse stated in his YouTube talk "The network is challenging us" (at 2:42) more than 14 years ago:
"With semantic understanding we'll have the real big brother. Someone is getting the best out of it and the rest will suffer."
Ha ha ha .. :(
https://pod.geraspora.de/posts/3d473600a616013da02e268acd52edbf
"Be fast and break things."
"Die" haben alle am Wickel und lachen sich einen.
A proprietary AI writes:
"This could lead to a critical attitude toward proprietary systems."
Sorry what?
Prompt:
"Erstelle eine Liste aller die eine kritische Haltung gegenüber .."
"Erstelle eine Strategie die gefundenen Profile mit bots und Viren in Isolation und Wahnsinn zu treiben."
.. I rest my case ..
#KI #AI
#fediAdmin #fediVerse #AI #KI
Text for robots.txt to disallow access for known AI crawlers:
User-Agent: GPTBot
User-Agent: ClaudeBot
User-Agent: Claude-Web
User-Agent: CCBot
User-Agent: Applebot-Extended
User-Agent: Facebookbot
User-Agent: Meta-ExternalAgent
User-Agent: diffbot
User-Agent: PerplexityBot
User-Agent: Omgili
User-Agent: Omgilibot
User-Agent: ImagesiftBot
User-Agent: Bytespider
User-Agent: Amazonbot
User-Agent: Youbot
Disallow: /
https://robotstxt.com/ai
AI / LLM User-Agents: Blocking Guide
Find out how to block your content from being used for AI/LLM training with robots.txt. Created by ex-Google engineer Fili.
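If you want a quick sanity check that such a grouped rule set (several User-Agent lines sharing one Disallow) really covers a given crawler, the Python standard library can parse it. This is just a minimal sketch with a shortened rule list and a made-up URL, not the full block above:

# Minimal check that grouped robots.txt rules block a given user agent.
# The rule list is shortened and the URL is only an example.
from urllib.robotparser import RobotFileParser

rules = """User-Agent: GPTBot
User-Agent: ClaudeBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("GPTBot", "https://example.social/display/123"))        # False: blocked
print(parser.can_fetch("SomeOtherBot", "https://example.social/display/123"))  # True: not listed

Keep in mind that robots.txt is only a request; well-behaved crawlers honor it, others simply ignore it.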
I wanted to run the OCR #bot on my own #Friendica server, because I like to run things on my own tech too.
So I asked Chatty to guide me, and it worked well: I logged in with the Python bot and wanted to start it, but got an API error. It was only then that I told ChatGPT that I am actually trying this with Friendica and that its API might have its limits.
I then looked up the documentation of the Mastodon API that Friendica implements and asked ChatGPT whether any essential parts needed to run the bot are missing.
ChatGPT 'read' the documentation of both the bot and Friendica's Mastodon API and explained:
"Since Friendica doesn't support the Mastodon streaming API, the OCRbot, which relies on these endpoints, may not be compatible with Friendica without significant modifications to either the bot or the Friendica server."
Just wanted to write a bit about this experience with Friendica, the bot and ChatGPT.
Maybe I will set up a #Pleroma instance for this, a single-user instance for me and the bot, or maybe better a Mastodon instance?
cc !Friendica Support