Commit graph

118 commits

Author SHA1 Message Date
pezcurrel
690b54521b Moved 2022-12-18 09:35:04 +01:00
pezcurrel
f269bb901d Little cosmetic change 2022-12-18 07:00:49 +01:00
pezcurrel
e9b88d6735 Made $jsonfp be written into run dir 2022-12-18 07:00:19 +01:00
pezcurrel
a3ada274e7 Removed stdout/err redirect in cmd; passing proper descriptor and pipe to proc_open; minor changes 2022-12-18 06:59:25 +01:00
pezcurrel
a32a25e095 Many many changes :-)) 2022-12-18 00:34:27 +01:00
pezcurrel
d6dd03694c Removed $context 2022-12-17 22:54:32 +01:00
pezcurrel
441d16a42d ckratelimit goes to sleep only when x-ratelimit-remaining==0; can spit debug info; limit fetching chunks from users directories is now 40 2022-12-17 22:54:02 +01:00
pezcurrel
ca4367b719 Removed the unlinking attempt at lockfp before exit: it was already done before by shutdown; other little changes about open files closing and the like 2022-12-17 18:43:13 +01:00
pezcurrel
e5ad18e619 Fixed a typo 2022-12-17 18:40:55 +01:00
pezcurrel
5c605cbe5b Some little cosmetic (readability of log files) changes 2022-12-17 18:40:22 +01:00
pezcurrel
2571396253 Tuned to recent changes in crawler.php (and getinstinfo.php) 2022-12-17 17:36:46 +01:00
pezcurrel
ad8fa26306 Made mysql connection and charset setting errors more graceful; added “users” page to updstats; other minor changes 2022-12-17 17:35:35 +01:00
pezcurrel
2d1d28b002 Fixed regexp checking if max_characters is an integer; made mexit use eecho again, moving the closing of logf after eecho(s); made logf be opened only if logminmsglev < 4 2022-12-17 17:33:46 +01:00
pezcurrel
d1f088a026 Command for subprocesses gets now built on the fly using cmd function; logfile doesn’t get opened if logminmsglev < 4; other minor changes 2022-12-17 17:31:24 +01:00
pezcurrel
6d897cfdff Removed “crawlernew” directory 2022-12-17 15:03:11 +01:00
pezcurrel
c7d5b50377 Adapted to new crawler version 2022-12-17 15:02:52 +01:00
pezcurrel
0b9e892aef Split old crawler.php in two; this is the part that coordinates 2022-12-17 15:02:20 +01:00
pezcurrel
7629a1caae Moved from subdir “crawlernew” 2022-12-17 15:00:36 +01:00
pezcurrel
b46469bfbb Cope with mysql errors even with php ver. < 8; check if $link is false before trying to close mysql connection in function mexit 2022-12-16 22:39:51 +01:00
pezcurrel
e46a82d923 Added suffix “s” to option “-t” in $cmd definition; cope with mysql errors even with php ver. < 8; other small changes 2022-12-16 22:38:16 +01:00
pezcurrel
3804171253 Renamed getfc to gurl 2022-12-16 21:59:26 +01:00
pezcurrel
d75a6445ae Fixed sleep and related message to actually use $options['udirfailst']; changed some options’ defaults; slightly changed the help text and some messages in the “fetchusers” section 2022-12-16 21:23:14 +01:00
pezcurrel
b666fcfdda Use parsetime for “--timeout” too, changed a bit the help text accordingly 2022-12-16 21:19:34 +01:00
pezcurrel
2d31b7ca79 Changed a bit how option “jsonwrite” works; other minor changes 2022-12-16 19:25:45 +01:00
pezcurrel
f193bd294c A very tiny db test, first commit 2022-12-16 19:12:17 +01:00
pezcurrel
9529a938a7 Lots of changes, not very important :-)) 2022-12-16 19:06:47 +01:00
pezcurrel
bab5bd5dd2 Added mysqli_query error management for older php versions to function myq; minor changes 2022-12-16 19:05:46 +01:00
pezcurrel
4522bc3ea8 Added mysqli_query error management for older php versions to function myq 2022-12-16 19:02:41 +01:00
pezcurrel
513981b7e2 Fetches info from an instance, first commit 2022-12-16 00:00:37 +01:00
pezcurrel
1cafbe05ea Crawler new version, “multithreaded”, coordinator script, first commit 2022-12-16 00:00:06 +01:00
pezcurrel
3a720a90ac Added a warning when nodeinfo specs couldn’t be fetched; made it set New=1 even when host doesn’t respond and is not in the db 2022-12-15 12:45:20 +01:00
pezcurrel
9360bdc481 Made false positives for “IsMastodon” less likely (impossible?) 2022-12-12 22:40:17 +01:00
pezcurrel
483fbcd103 Added the possibility, for each set of records with the same URI, to choose one record to keep and delete the others, or to automatically keep the record with the lowest ID and delete the others 2022-12-12 21:32:35 +01:00
pezcurrel
90e85f8182 Took away “-t 10” option from crawler.php calls since 10 is now its default timeout 2022-12-12 17:06:51 +01:00
pezcurrel
e07fba673d Fixed call to non-existent function “mysq”, changing it to “myq” 2022-12-12 08:36:18 +01:00
pezcurrel
ec6324fb4f Made langs() shorten to a maximum of 5 elements the $languages array 2022-12-12 08:29:18 +01:00
pezcurrel
d80ba5ddc4 Fixed double “,” in a query inside langs() 2022-12-12 08:24:26 +01:00
pezcurrel
6f9260e08e myq() did not return results, now it does 2022-12-12 08:17:01 +01:00
pezcurrel
f6752a34bc Added function “myq” as a wrapper for mysqli_query managing exceptions; used it throughout the whole script 2022-12-12 08:12:29 +01:00
pezcurrel
2649e7d137 Info from nodeinfo didn’t end up in $info; now it does 2022-12-12 00:47:06 +01:00
pezcurrel
783e54a9f9 Little script to convert a unix timestamp in input to a date 2022-12-11 23:33:09 +01:00
pezcurrel
1701a2cfe6 Little script to search for records with the same “URI” in Instances table 2022-12-11 23:32:42 +01:00
pezcurrel
853be9f0e0 Little script to fix uris beginning with “https://” 2022-12-11 23:31:47 +01:00
pezcurrel
b16515f4e8 Lots of changes :-)) 2022-12-11 23:29:51 +01:00
pezcurrel
61ad655a62 Disabled fetching profile’s page when “noindex” is not set in account because it takes too long; disabled featured tags fetching for the same reason; other minor changes 2022-12-10 23:32:58 +01:00
pezcurrel
f343cb702e Changed some eecho messages importance 2022-12-10 13:57:30 +01:00
pezcurrel
4b7f6a199c Added truncs where needed; added code to check for “noindex” on user’s profile page when “noindex” is not set in accounts info 2022-12-10 12:35:22 +01:00
pezcurrel
18ce06871b Added ckratelimit() where useful; made it more flexible with lowercasing every header key; more work on fetching users from users directories 2022-12-09 22:53:18 +01:00
pezcurrel
8341f0e209 Fixed a cosmetic bug; some more work into users directories fetching 2022-12-09 19:25:44 +01:00
pezcurrel
ffd20debe6 Removed executable attribute 2022-12-08 14:35:55 +01:00