Commit graph

187 commits

Author SHA1 Message Date
pezcurrel
6f5de9730e When server thumb or admin avatar are unavailable, set them to “unavailable” 2022-12-28 07:01:29 +01:00
pezcurrel
cf158ceb73 Now it makes a tar.xz of the “run” directory contents and removes them before running commands 2022-12-28 05:25:56 +01:00
pezcurrel
ce66aa56e9 Added instance blocks support 2022-12-27 23:02:31 +01:00
pezcurrel
430c35e17a Little change in progress string 2022-12-27 23:01:39 +01:00
pezcurrel
73469fa012 Default pool size back to 10 2022-12-27 15:39:46 +01:00
pezcurrel
c0b9c19469 Added keys check on instance info fetched from api v2 and v1; subordinated language checks to “$instaswered”; made get_toot_languages cope better with possible errors 2022-12-27 09:05:31 +01:00
pezcurrel
f7f1ac4cb2 logfile has .gii.log extension, handy to select only these files when run from crawler.php 2022-12-26 22:10:20 +01:00
pezcurrel
ea0118d445 cmd now prepends exec to command; pipes get closed 2022-12-26 22:09:18 +01:00
pezcurrel
463ef7cd37 Removed a dangling “}” which was breaking the script 2022-12-26 18:06:54 +01:00
pezcurrel
ff2d6c09a8 Removed “peers.all” file; made the script write to “peers files” only on exit (be it clean or by interruption) 2022-12-26 16:40:10 +01:00
pezcurrel
44b6456695 Made check on peers more strict, made the script abandon checking when a peers entry is malformed 2022-12-26 15:27:14 +01:00
pezcurrel
d301d25fcc Removed witches.live; still have to check why checking its peers takes so long on the server when it doesn’t here; probably we just need to increase the server RAM and/or CPU 2022-12-26 15:02:50 +01:00
pezcurrel
3f7a5ff69c Moved $graceline definition up, fixing a bug; made peers checking msgs more informative 2022-12-26 15:01:35 +01:00
pezcurrel
97f5b99654 Removed ckpeers function, it was overkill; added a preliminary check for “stringness” to the checks on each peer 2022-12-26 14:51:42 +01:00
pezcurrel
a4b2ae731c Added witches.live since its peers list is huge and full of .activitypub-troll.cf 2022-12-26 14:50:26 +01:00
pezcurrel
926b1b0d73 Added ckpeers function to check if the json array returned by api/v1/instance/peers is well formed 2022-12-26 14:12:06 +01:00
pezcurrel
3c1621df1d Added “LastOkCheckTS” to $instints (array of Instances columns of integer type) 2022-12-26 13:29:19 +01:00
pezcurrel
e820845775 Consider instances which have LastOkCheckTS=null but InsertTS>=$graceline as not dead, and to be checked 2022-12-26 13:28:09 +01:00
pezcurrel
61da12100f Removed comment with Instances columns 2022-12-26 12:29:08 +01:00
pezcurrel
ebc458cc2c Removed “resurrect” option and references to Instances.Dead 2022-12-26 12:28:21 +01:00
pezcurrel
1e1b2a99e9 Dropped Instances.Dead, using Instances.LastOkCheckTS now instead 2022-12-26 12:25:15 +01:00
pezcurrel
9b3cca9a45 Script to reasonably set Instances.LastOkCheckTS, first commit 2022-12-26 12:21:47 +01:00
pezcurrel
95b9ccfc31 Renamed “LastCheckOk” to “WasLastCheckOk” 2022-12-26 05:30:35 +01:00
pezcurrel
00caa1dcb9 Changed default for “deadline” option from 62 to 31 days 2022-12-26 05:17:59 +01:00
pezcurrel
44f437c928 Translated initial comment, made it more terse 2022-12-26 05:09:09 +01:00
pezcurrel
5312aea0cc Added writing server rules in the db 2022-12-26 05:08:17 +01:00
pezcurrel
337eb32f51 Made mail optional, inactive by default 2022-12-25 23:45:31 +01:00
pezcurrel
429ab42ff5 Added 2 debug messages stating how many dead instances the script got from Instances and Peers tables 2022-12-25 18:55:54 +01:00
pezcurrel
119c9119c2 Made the logic for “deadline” much more terse 2022-12-25 18:41:13 +01:00
pezcurrel
acde202b2e Made it write summary from crawl run into a log file of its own 2022-12-25 18:40:21 +01:00
pezcurrel
ba171bd5f2 Use specifically bash since we use &> redirect 2022-12-25 11:43:30 +01:00
pezcurrel
ec9b65e42f Little script to run peerscrawl.php in loop; first commit 2022-12-25 11:32:49 +01:00
pezcurrel
10e2e1b58a Added “lecho” for “message levels”, removed “gecho”, removed “verbose” option; removed “loop” option (do loop from a shell script if needed) 2022-12-25 11:32:08 +01:00
pezcurrel
1d0c6b799a Small edit to “logminmsglev” and “tuiminmsglev” TUI option parsing errors 2022-12-25 11:29:34 +01:00
pezcurrel
d95bc70b8a Exposed “deadline” option; minor changes 2022-12-25 09:47:04 +01:00
pezcurrel
c0802de828 Removed “restore” option: could work, but it’s not very useful and would require a big hassle; added loops and new found instances counters; made sighandler use mexit 2022-12-25 09:24:23 +01:00
pezcurrel
d6b77b0e29 Removed option “-p peers” from crawler cmdline because now peerscrawl directly writes new instances into the db 2022-12-24 08:59:33 +01:00
pezcurrel
9fabb3853b Indeed 2022-12-23 19:13:37 +01:00
pezcurrel
96aa6f3aa9 That stuff there 2022-12-23 19:12:18 +01:00
pezcurrel
05fed0142c Lowered a bit default values for “timeout” and “curltimeout” 2022-12-23 11:23:32 +01:00
pezcurrel
89a2ea0b26 Fixed “trending tags” ordering and fetching 2022-12-23 11:22:25 +01:00
pezcurrel
edee66b834 Temporarily disabled “restore” option because it needs more work to actually work 2022-12-22 15:32:30 +01:00
pezcurrel
61d0fcb3d8 Added “loop” option to run the crawl in an infinite loop or until sig(int|hup|term) is received; other minor changes 2022-12-22 15:05:55 +01:00
pezcurrel
6477e8812f Exposed “curltimeout” option; changed “timeout” default from 5 to 10; changed “curltimeout” default from 10 to 20 2022-12-22 14:24:48 +01:00
pezcurrel
6d4ce26f98 Adapted “restore” code to the new workings; minor changes and fixes 2022-12-22 14:04:29 +01:00
pezcurrel
c27053314a Added code to store and consider “instance checks” made by the script to independently mark peers as dead 2022-12-22 11:32:18 +01:00
pezcurrel
706c831e23 Little change in a message 2022-12-22 11:28:29 +01:00
pezcurrel
f8cdf2cf3b Changed check against “activity” values, which are strings, not integers 2022-12-22 07:40:41 +01:00
pezcurrel
c6c3feb500 Removed leftovers of “jsonwrite” option 2022-12-22 07:05:21 +01:00
pezcurrel
277296512c Explicitly set idn_to_ascii flags, otherwise with php 7.3 it complained 2022-12-21 22:15:40 +01:00