Commit graph

772 commits

Author SHA1 Message Date
pezcurrel
44b6456695 Made check on peers more strict, made the script abandon checking when a peers entry is malformed 2022-12-26 15:27:14 +01:00
pezcurrel
d301d25fcc Removed witches.live; have to check why checking its peers takes so long on the server when it doesn’t here, probably we just need to increase the server RAM and/or CPU 2022-12-26 15:02:50 +01:00
pezcurrel
3f7a5ff69c Moved $graceline definition up, fixing a bug; made peers checking msgs more informative 2022-12-26 15:01:35 +01:00
pezcurrel
97f5b99654 Removed ckpeers function, it was overkill; added a preliminary check for “stringness” to the checks on each peer 2022-12-26 14:51:42 +01:00
pezcurrel
a4b2ae731c Added witches.live since its peers list is huge and full of .activitypub-troll.cf 2022-12-26 14:50:26 +01:00
pezcurrel
926b1b0d73 Added ckpeers function to check if the json array returned by api/v1/instance/peers is well formed 2022-12-26 14:12:06 +01:00
pezcurrel
3c1621df1d Added “LastOkCheckTS” to $instints (array of Instances columns of integer type) 2022-12-26 13:29:19 +01:00
pezcurrel
e820845775 Consider instances which have LastOkCheckTS=null but InsertTS>=$graceline as not dead, and to be checked 2022-12-26 13:28:09 +01:00
pezcurrel
d6d8adee49 Updated to latest changes 2022-12-26 12:30:56 +01:00
pezcurrel
61da12100f Removed comment with Instances columns 2022-12-26 12:29:08 +01:00
pezcurrel
ebc458cc2c Removed “resurrect” option and references to Instances.Dead 2022-12-26 12:28:21 +01:00
pezcurrel
d18775fd52 Defines $gracetime, first commit 2022-12-26 12:27:08 +01:00
pezcurrel
1e1b2a99e9 Dropped Instances.Dead, using Instances.LastOkCheckTS now instead 2022-12-26 12:25:15 +01:00
pezcurrel
9b3cca9a45 Script to reasonably set Instances.LastOkCheckTS, first commit 2022-12-26 12:21:47 +01:00
pezcurrel
95b9ccfc31 Renamed “LastCheckOk” to “WasLastCheckOk” 2022-12-26 05:30:35 +01:00
pezcurrel
00caa1dcb9 Changed default for “deadline” option from 62 to 31 days 2022-12-26 05:17:59 +01:00
pezcurrel
44f437c928 Translated initial comment, made it more terse 2022-12-26 05:09:09 +01:00
pezcurrel
5312aea0cc Added writing server rules in the db 2022-12-26 05:08:17 +01:00
pezcurrel
337eb32f51 Made mail optional, inactive by default 2022-12-25 23:45:31 +01:00
pezcurrel
429ab42ff5 Added 2 debug messages stating how many dead instances the script got from Instances and Peers tables 2022-12-25 18:55:54 +01:00
pezcurrel
e3968a5ace Updated to latest changes 2022-12-25 18:54:55 +01:00
pezcurrel
119c9119c2 Made the logic for “deadline” much more terse 2022-12-25 18:41:13 +01:00
pezcurrel
acde202b2e Made it write summary from crawl run into a log file of its own 2022-12-25 18:40:21 +01:00
pezcurrel
ba171bd5f2 Use specifically bash since we use &> redirect 2022-12-25 11:43:30 +01:00
pezcurrel
ec9b65e42f Little script to run peerscrawl.php in loop; first commit 2022-12-25 11:32:49 +01:00
pezcurrel
10e2e1b58a Added “lecho” for “message levels”, removed “gecho”, removed “verbose” option; removed “loop” option (do loop from a shell script if needed) 2022-12-25 11:32:08 +01:00
pezcurrel
1d0c6b799a Small edit to “logminmsglev” and “tuiminmsglev” TUI option parsing errors 2022-12-25 11:29:34 +01:00
pezcurrel
d95bc70b8a Exposed “deadline” option; minor changes 2022-12-25 09:47:04 +01:00
pezcurrel
c0802de828 Removed “restore” option: could work, but it’s not very useful and would require a big hassle; added counters for loops and newly found instances; made sighandler use mexit 2022-12-25 09:24:23 +01:00
pezcurrel
d6b77b0e29 Removed option “-p peers” from crawler cmdline because now peerscrawl directly writes new instances into the db 2022-12-24 08:59:33 +01:00
pezcurrel
9fabb3853b Indeed 2022-12-23 19:13:37 +01:00
pezcurrel
96aa6f3aa9 That stuff there 2022-12-23 19:12:18 +01:00
pezcurrel
05fed0142c Lowered a bit default values for “timeout” and “curltimeout” 2022-12-23 11:23:32 +01:00
pezcurrel
89a2ea0b26 Fixed “trending tags” ordering and fetching 2022-12-23 11:22:25 +01:00
pezcurrel
edee66b834 Temporarily disabled “restore” option because it needs more work to actually work 2022-12-22 15:32:30 +01:00
pezcurrel
61d0fcb3d8 Added “loop” option allowing the crawl to run in an infinite loop or until sig(int|hup|term) is received; other minor changes 2022-12-22 15:05:55 +01:00
pezcurrel
6477e8812f Exposed “curltimeout” option; changed “timeout” default from 5 to 10; changed “curltimeout” default from 10 to 20 2022-12-22 14:24:48 +01:00
pezcurrel
6d4ce26f98 Adapted “restore” code to the new workings; minor changes and fixes 2022-12-22 14:04:29 +01:00
pezcurrel
a47ccdd5e2 Merge branch 'weblate': new strings have been translated into Russian 2022-12-22 11:41:12 +01:00
pezcurrel
562639fb5c Updated after adding Peers and PeersCheck tables 2022-12-22 11:37:00 +01:00
pezcurrel
c27053314a Added code to store and consider “instance checks” made by the script to independently mark peers as dead 2022-12-22 11:32:18 +01:00
pezcurrel
f6bc6a12d4 Made it output full admin account address, linked to their profile page 2022-12-22 11:29:44 +01:00
pezcurrel
706c831e23 Little change in a message 2022-12-22 11:28:29 +01:00
pezcurrel
335944add8 Removed “minimum number of known instances” from criteria, because domain_count is gone from /api/v2/instance 2022-12-22 07:51:51 +01:00
pezcurrel
f8cdf2cf3b Changed check against “activity” values, which are strings, not integers 2022-12-22 07:40:41 +01:00
pezcurrel
c6c3feb500 Removed leftovers of “jsonwrite” option 2022-12-22 07:05:21 +01:00
pezcurrel
277296512c Explicitly set idn_to_ascii flags, otherwise with php 7.3 it complained 2022-12-21 22:15:40 +01:00
pezcurrel
9316e686b9 Big rewrite, made it shorter and hopefully a bit more readable 2022-12-21 22:07:05 +01:00
pezcurrel
732ea79480 Moved $mastodons definition further up 2022-12-21 22:06:10 +01:00
pezcurrel
1c524ffd69 Moved mysqli_close after the optional loading of dead instances from the db; renamed $eta to $tet 2022-12-21 22:05:15 +01:00
pezcurrel
0d74dbf243 Got rid of akeavinn; other minor changes 2022-12-21 07:54:11 +01:00
pezcurrel
d803d6f667 When curl error is unknown, the message is now “reason unknown” instead of “unknown” 2022-12-21 06:54:58 +01:00
pezcurrel
2c86580bfb The regexp to decide whether an instance is Mastodon or not is now based on the Platforms table; made ckratelimit more precise about possible missing headers; added code to set Version from /api/v1/instance when it was not already set from nodeinfo 2022-12-21 06:53:31 +01:00
pezcurrel
d42499747e Updated to new version: added indexes; changed Instances.AdmCreatedAt from float to int 2022-12-20 23:05:02 +01:00
pezcurrel
4fdf287686 Little change in delete prompt 2022-12-20 23:02:53 +01:00
pezcurrel
d8f15f4b3a Deletes from UserFields much faster 2022-12-20 23:01:13 +01:00
pezcurrel
f6dc080ed6 Major rewrite: it was a mess, now it is less :-) 2022-12-20 23:00:22 +01:00
pezcurrel
f3081612da Multibyte lower case first char of a string; first commit 2022-12-20 22:59:24 +01:00
Alex Maryson Jr
d2214b6ef3
Translated using Weblate (Russian)
Currently translated at 55.0% (219 of 398 strings)

Translation: mastodon.help/Site
Translate-URL: https://hosted.weblate.org/projects/mastodon-help/site/ru/
2022-12-20 19:47:12 +01:00
pezcurrel
8eaeaf0d96 Removed references to New, Good, Chosen columns, since they’re gone 2022-12-18 18:45:39 +01:00
pezcurrel
d07b5e4efd “New” is now based on “InsertTS” 2022-12-18 18:44:33 +01:00
pezcurrel
bee3a04b19 Updated to latest changes 2022-12-18 18:43:38 +01:00
pezcurrel
f30bcc7e5e Updated to new delinstbyuid function form 2022-12-18 18:43:09 +01:00
pezcurrel
e744ae9d20 Fixed validhostname; added and using myq function for mysql queries; changed some exit codes; made connection error management better 2022-12-18 18:42:11 +01:00
pezcurrel
f29f636a70 Added mysql error code to error message that is echoed if a query fails 2022-12-18 18:40:22 +01:00
pezcurrel
cd16eabce5 Using new form of delinstbyuid function; removed references to New column 2022-12-18 18:27:22 +01:00
pezcurrel
835e02c171 Removed code referencing New, Good and Chosen columns 2022-12-18 18:26:03 +01:00
pezcurrel
f4aa3cb804 A script to set InsertTS field to something suitable when it is null 2022-12-18 18:24:35 +01:00
pezcurrel
c5debcb463 A function to delete an Instance record by ID, and all references to it in other tables; first commit 2022-12-18 18:23:52 +01:00
pezcurrel
820bc3180d Made browser language detection more flexible (eg: if browser language has pt, use pt_BR since we don’t have pt yet, if ever) 2022-12-18 11:46:01 +01:00
pezcurrel
ccc9f517fd Moved code to delete an Instances record and all its references in other tables to function delinstbyid in lib/delinstbyid.php; minor changes 2022-12-18 11:44:11 +01:00
pezcurrel
2b0e2398ae Made validhostname accept only valid hostnames :-)) (no ports or path specs) 2022-12-18 11:42:32 +01:00
pezcurrel
32251d1ba8 Added “deleteinstswhere” action 2022-12-18 11:41:09 +01:00
pezcurrel
7d2875075b Deleted 2022-12-18 09:35:24 +01:00
pezcurrel
690b54521b Moved 2022-12-18 09:35:04 +01:00
pezcurrel
f269bb901d Little cosmetic change 2022-12-18 07:00:49 +01:00
pezcurrel
e9b88d6735 Made $jsonfp be written into run dir 2022-12-18 07:00:19 +01:00
pezcurrel
a3ada274e7 Removed stdout/err redirect in cmd; passing proper descriptor and pipe to proc_open; minor changes 2022-12-18 06:59:25 +01:00
pezcurrel
a32a25e095 Many many changes :-)) 2022-12-18 00:34:27 +01:00
pezcurrel
d6dd03694c Removed $context 2022-12-17 22:54:32 +01:00
pezcurrel
441d16a42d ckratelimit goes to sleep only when x-ratelimit-remaining==0; can spit debug info; the limit when fetching chunks from users directories is now 40 2022-12-17 22:54:02 +01:00
pezcurrel
ca4367b719 Removed the unlinking attempt at lockfp before exit: it was already done before by shutdown; other little changes about open files closing and the like 2022-12-17 18:43:13 +01:00
pezcurrel
e5ad18e619 Fixed a typo 2022-12-17 18:40:55 +01:00
pezcurrel
5c605cbe5b Some little cosmetic (readability of log files) changes 2022-12-17 18:40:22 +01:00
pezcurrel
2571396253 Tuned to recent changes in crawler.php (and getinstinfo.php) 2022-12-17 17:36:46 +01:00
pezcurrel
ad8fa26306 Made mysql connection and charset setting errors more graceful; added “users” page to updstats; other minor changes 2022-12-17 17:35:35 +01:00
pezcurrel
2d1d28b002 Fixed regexp checking if max_characters is an integer; made mexit use eecho again, moving the closing of logf after eecho(s); made logf be opened only if logminmsglev < 4 2022-12-17 17:33:46 +01:00
pezcurrel
d1f088a026 Command for subprocesses now gets built on the fly using cmd function; logfile doesn’t get opened if logminmsglev < 4; other minor changes 2022-12-17 17:31:24 +01:00
pezcurrel
6d897cfdff Removed “crawlernew” directory 2022-12-17 15:03:11 +01:00
pezcurrel
c7d5b50377 Adapted to new crawler version 2022-12-17 15:02:52 +01:00
pezcurrel
0b9e892aef Split old crawler.php in 2; this is the part that coordinates 2022-12-17 15:02:20 +01:00
pezcurrel
7629a1caae Moved from subdir “crawlernew” 2022-12-17 15:00:36 +01:00
pezcurrel
b46469bfbb Cope with mysql errors even with php ver. < 8; check if $link is false before trying to close mysql connection in function mexit 2022-12-16 22:39:51 +01:00
pezcurrel
e46a82d923 Added suffix “s” to option “-t” in $cmd definition; cope with mysql errors even with php ver. < 8; other small changes 2022-12-16 22:38:16 +01:00
pezcurrel
17164166ee Merge branch 'weblate' for newly translated strings 2022-12-16 22:03:29 +01:00
Ігор Андреєв
b4adda3e10
Translated using Weblate (Ukrainian)
Currently translated at 100.0% (398 of 398 strings)

Translation: mastodon.help/Site
Translate-URL: https://hosted.weblate.org/projects/mastodon-help/site/uk/
2022-12-16 22:01:47 +01:00
J. Lavoie
7087cafb3f
Translated using Weblate (Italian)
Currently translated at 100.0% (398 of 398 strings)

Translation: mastodon.help/Site
Translate-URL: https://hosted.weblate.org/projects/mastodon-help/site/it/
2022-12-16 22:01:46 +01:00
J. Lavoie
62e4c1fa0f
Translated using Weblate (French)
Currently translated at 98.7% (393 of 398 strings)

Translation: mastodon.help/Site
Translate-URL: https://hosted.weblate.org/projects/mastodon-help/site/fr/
2022-12-16 22:01:46 +01:00
pezcurrel
3804171253 Renamed getfc to gurl 2022-12-16 21:59:26 +01:00
pezcurrel
1bdd59dca7 Deleted 2022-12-16 21:58:06 +01:00
pezcurrel
d7ffe48214 Renamed from getfc.php 2022-12-16 21:57:45 +01:00
pezcurrel
a1a6d1ec5d Renamed “getfc” to “gurl” 2022-12-16 21:35:38 +01:00
pezcurrel
fe99014208 Made “CURLOPT_TIMEOUT” variable, set it to 45 secs by default 2022-12-16 21:24:42 +01:00
pezcurrel
d75a6445ae Fixed sleep and related message to actually use $options['udirfailst']; changed some options’ defaults; slightly changed the help text; slightly changed some messages in the “fetchusers” section 2022-12-16 21:23:14 +01:00
pezcurrel
b666fcfdda Use parsetime for “--timeout” too, changed a bit the help text accordingly 2022-12-16 21:19:34 +01:00
pezcurrel
2d31b7ca79 Changed a bit how option “jsonwrite” works; other minor changes 2022-12-16 19:25:45 +01:00
pezcurrel
f193bd294c A very tiny db test, first commit 2022-12-16 19:12:17 +01:00
pezcurrel
1deed93b6f Some preliminary work reusing “index.php” code 2022-12-16 19:11:39 +01:00
pezcurrel
6ff4752dd7 Made it cope with time values < 1 second; changed the way of passing suffixes; other minor changes 2022-12-16 19:10:29 +01:00
pezcurrel
d59eb396eb Added preliminary support to “Users” page 2022-12-16 19:08:39 +01:00
pezcurrel
9529a938a7 Lots of changes, not very important :-)) 2022-12-16 19:06:47 +01:00
pezcurrel
bab5bd5dd2 Added mysqli_query error management for older php versions to function myq; minor changes 2022-12-16 19:05:46 +01:00
pezcurrel
4522bc3ea8 Added mysqli_query error management for older php versions to function myq 2022-12-16 19:02:41 +01:00
pezcurrel
513981b7e2 Fetches info from an instance, first commit 2022-12-16 00:00:37 +01:00
pezcurrel
1cafbe05ea Crawler new version, “multithreaded”, coordinator script, first commit 2022-12-16 00:00:06 +01:00
pezcurrel
1430cd80fb Time spec parser, first commit 2022-12-15 23:57:31 +01:00
pezcurrel
fdc7ddbd1f Made it use an array for output 2022-12-15 23:52:32 +01:00
pezcurrel
f7c32d00ef Made it cope with “accept-encoding” more properly 2022-12-15 14:38:26 +01:00
pezcurrel
8cf8c416ed Made it cope with gzip encoded content 2022-12-15 12:45:52 +01:00
pezcurrel
3a720a90ac Added a warning when nodeinfo specs couldn’t be fetched; made it set New=1 even when host doesn’t respond and is not in the db 2022-12-15 12:45:20 +01:00
Affir Vega
d9ccf5cdfe
Translated using Weblate (Russian)
Currently translated at 51.7% (206 of 398 strings)

Translation: mastodon.help/Site
Translate-URL: https://hosted.weblate.org/projects/mastodon-help/site/ru/
2022-12-13 13:48:32 +01:00
nichu42
f17453b0c9
Translated using Weblate (German)
Currently translated at 98.4% (392 of 398 strings)

Translation: mastodon.help/Site
Translate-URL: https://hosted.weblate.org/projects/mastodon-help/site/de/
2022-12-13 13:48:32 +01:00
pezcurrel
9360bdc481 Made false positives for “IsMastodon” less likely (impossible?) 2022-12-12 22:40:17 +01:00
pezcurrel
483fbcd103 Added the possibility, for each set of records with the same URI, to choose one record to keep and delete the others, or to automatically keep the record with the lowest ID and delete the others 2022-12-12 21:32:35 +01:00
pezcurrel
90e85f8182 Took away “-t 10” option from crawler.php calls since 10 is now its default timeout 2022-12-12 17:06:51 +01:00
pezcurrel
e07fba673d Fixed call to non-existent function “mysq”, now “myq” 2022-12-12 08:36:18 +01:00
pezcurrel
ec6324fb4f Made langs() shorten to a maximum of 5 elements the $languages array 2022-12-12 08:29:18 +01:00
pezcurrel
d80ba5ddc4 Fixed double “,” in a query inside langs() 2022-12-12 08:24:26 +01:00
pezcurrel
6f9260e08e myq() did not return results, now it does 2022-12-12 08:17:01 +01:00
pezcurrel
f6752a34bc Added function “myq” as a wrapper for mysqli_query managing exceptions; used it throughout the whole script 2022-12-12 08:12:29 +01:00
pezcurrel
2649e7d137 Info from nodeinfo didn’t end up in $info, now it does 2022-12-12 00:47:06 +01:00
pezcurrel
879a86a211 Merge branch 'weblate' with newly translated pt_BR strings 2022-12-11 23:38:08 +01:00
Fábio Rodrigues Ribeiro
5cc375af50
Translated using Weblate (Portuguese (Brazil))
Currently translated at 100.0% (398 of 398 strings)

Translation: mastodon.help/Site
Translate-URL: https://hosted.weblate.org/projects/mastodon-help/site/pt_BR/
2022-12-11 23:36:28 +01:00
pezcurrel
783e54a9f9 Little script to convert a unix timestamp in input to a date 2022-12-11 23:33:09 +01:00
pezcurrel
1701a2cfe6 Little script to search for records with the same “URI” in Instances table 2022-12-11 23:32:42 +01:00
pezcurrel
853be9f0e0 Little script to fix uris beginning with “https://” 2022-12-11 23:31:47 +01:00
pezcurrel
b16515f4e8 Lots of changes :-)) 2022-12-11 23:29:51 +01:00
pezcurrel
882222bdb9 Added “Code” section to “About” page 2022-12-11 05:38:42 +01:00
pezcurrel
7e52c5683f Merge newly translated strings from branch 'weblate' 2022-12-11 05:22:41 +01:00
Ігор Андреєв
718efc01f9
Translated using Weblate (Ukrainian)
Currently translated at 100.0% (396 of 396 strings)

Translation: mastodon.help/Site
Translate-URL: https://hosted.weblate.org/projects/mastodon-help/site/uk/
2022-12-11 05:20:52 +01:00
pezcurrel
72655faa88 Added a “Code” section linking git.lattuga.net, and a link to Ігор Андреєв’s Mastodon profile 2022-12-11 05:18:26 +01:00
pezcurrel
61ad655a62 Disabled fetching profile’s page when “noindex” is not set in account because it takes too long; disabled featured tags fetching for the same reason; other minor changes 2022-12-10 23:32:58 +01:00
pezcurrel
f343cb702e Changed some eecho messages importance 2022-12-10 13:57:30 +01:00
pezcurrel
4b7f6a199c Added truncs where needed; added code to check for “noindex” on user’s profile page when “noindex” is not set in accounts info 2022-12-10 12:35:22 +01:00
pezcurrel
eeb3ce6edf Made it wait up to 30 seconds for contents 2022-12-10 12:33:21 +01:00
Pongrèbio
4c721f7bad
Translated using Weblate (Russian)
Currently translated at 46.4% (184 of 396 strings)

Translation: mastodon.help/Site
Translate-URL: https://hosted.weblate.org/projects/mastodon-help/site/ru/
2022-12-09 23:47:47 +01:00
Pongrèbio
b39aa9e284
Translated using Weblate (Persian)
Currently translated at 77.5% (307 of 396 strings)

Translation: mastodon.help/Site
Translate-URL: https://hosted.weblate.org/projects/mastodon-help/site/fa/
2022-12-09 23:47:46 +01:00
J. Lavoie
3feca8f402
Translated using Weblate (French)
Currently translated at 98.7% (391 of 396 strings)

Translation: mastodon.help/Site
Translate-URL: https://hosted.weblate.org/projects/mastodon-help/site/fr/
2022-12-09 23:47:46 +01:00
pezcurrel
18ce06871b Added ckratelimit() where useful; made it more flexible with lowercasing every header key; more work on fetching users from users directories 2022-12-09 22:53:18 +01:00
pezcurrel
8341f0e209 Fixed a cosmetic bug; some more work into users directories fetching 2022-12-09 19:25:44 +01:00