pezcurrel
|
6f296a1d27
|
Fixed writing langs ids to InstOurLangs; made detected languages fall back to declared languages when they can't be detected
|
2023-01-01 19:50:57 +01:00 |
|
pezcurrel
|
fa97078fe4
|
Passing calling __LINE__ through getlangsidsarr and getlangsid the right way
|
2023-01-01 19:26:39 +01:00 |
|
pezcurrel
|
87fbbbd2f9
|
Added validity check on declared and detected lang codes to getlangid function - when a lang code is not valid it falls back to the “en” default; added a getlangsidsarr function to cope with possible dupes and save code
|
2023-01-01 19:21:21 +01:00 |
|
pezcurrel
|
c039d85d76
|
A little script to fix InstLangs, InstOurLangs and Languages table as of 2023-01-01
|
2023-01-01 18:23:34 +01:00 |
|
pezcurrel
|
24b9f9b911
|
Made it canonicalize lang codes in getlangid
|
2023-01-01 18:22:48 +01:00 |
|
pezcurrel
|
a4486403c7
|
Made language fetching normalize language code to a form with _ instead of -
|
2023-01-01 16:51:20 +01:00 |
|
pezcurrel
|
5ffd28f906
|
Added nl language
|
2023-01-01 11:14:25 +01:00 |
|
pezcurrel
|
0863c32703
|
Added InstChecks cleaning to “clean” action
|
2022-12-31 09:46:28 +01:00 |
|
pezcurrel
|
a4215c943d
|
Added a notify() to getlangid() to be done if Languages contains more than 1 record with given language code
|
2022-12-31 07:19:00 +01:00 |
|
pezcurrel
|
0c2f4e6b27
|
Added “intness” check on values from “tag trends history”
|
2022-12-31 07:13:02 +01:00 |
|
pezcurrel
|
b8e84c72c5
|
Fixed a misleading comment
|
2022-12-30 19:20:48 +01:00 |
|
pezcurrel
|
b40e474e76
|
Added code to set TotChecks and OkChecks
|
2022-12-30 18:12:07 +01:00 |
|
pezcurrel
|
b9392a0c25
|
Little script to set Instances.TotChecks and Instances.OkChecks according to records stored in InstChecks table
|
2022-12-30 17:16:11 +01:00 |
|
pezcurrel
|
f2d4119dfa
|
Fixed bug on domain block entity format check
|
2022-12-30 14:07:22 +01:00 |
|
pezcurrel
|
b39eafdb80
|
Fixed getlangid missing $opts; fixed array_pop on nonexistent $languages var
|
2022-12-29 12:49:38 +01:00 |
|
pezcurrel
|
94bfec5f78
|
“$list” parameter of function “crawl” is now passed by reference; “$list” now gets unset in any case after it has been looped through; this will hopefully decrease the amount of memory used by the script a bit
|
2022-12-29 09:28:18 +01:00 |
|
pezcurrel
|
ec33912ff8
|
Restructured a bit the language management code
|
2022-12-28 23:47:23 +01:00 |
|
pezcurrel
|
db749d2e7d
|
Consider the possibility that “our languages” have been locked
|
2022-12-28 19:17:05 +01:00 |
|
pezcurrel
|
b929d06302
|
Don’t INSERT an instance if it did not respond: it was useful before instance “deadness” was autonomously managed by peerscrawl.php
|
2022-12-28 18:59:09 +01:00 |
|
pezcurrel
|
e8d588c0f2
|
Refactored language management
|
2022-12-28 18:34:57 +01:00 |
|
pezcurrel
|
2b37228a1c
|
Fixed some flaws in detecting if Thumb and AdmAvatar are to be set to “unavailable”; fixed “noindex” logic, now it also explicitly sets AdmAccount to special “OPTED OUT” value when noindex=true
|
2022-12-28 17:09:25 +01:00 |
|
pezcurrel
|
870e3524ca
|
Fixed $maxround not set to actual max round
|
2022-12-28 17:06:39 +01:00 |
|
pezcurrel
|
6f5de9730e
|
When server thumb or admin avatar are unavailable, set them to “unavailable”
|
2022-12-28 07:01:29 +01:00 |
|
pezcurrel
|
cf158ceb73
|
Now it makes a tar.xz with “run” directory contents and removes them before running commands
|
2022-12-28 05:25:56 +01:00 |
|
pezcurrel
|
ce66aa56e9
|
Added instance blocks support
|
2022-12-27 23:02:31 +01:00 |
|
pezcurrel
|
430c35e17a
|
Little change in progress string
|
2022-12-27 23:01:39 +01:00 |
|
pezcurrel
|
73469fa012
|
Default pool size back to 10
|
2022-12-27 15:39:46 +01:00 |
|
pezcurrel
|
c0b9c19469
|
Added keys check on instance info fetched from api v2 and v1; subordinated language checks to “$instaswered”; made get_toot_languages cope better with possible errors
|
2022-12-27 09:05:31 +01:00 |
|
pezcurrel
|
f7f1ac4cb2
|
logfile has .gii.log extension, handy to select only these files when run from crawler.php
|
2022-12-26 22:10:20 +01:00 |
|
pezcurrel
|
ea0118d445
|
cmd now prepends exec to command; pipes get closed
|
2022-12-26 22:09:18 +01:00 |
|
pezcurrel
|
463ef7cd37
|
Removed a dangling “}” which was breaking the script
|
2022-12-26 18:06:54 +01:00 |
|
pezcurrel
|
ff2d6c09a8
|
Removed “peers.all” file; made the script write to “peers files” only on exit (be it clean or by interruption)
|
2022-12-26 16:40:10 +01:00 |
|
pezcurrel
|
44b6456695
|
Made check on peers more strict, made the script abandon checking when a peers entry is malformed
|
2022-12-26 15:27:14 +01:00 |
|
pezcurrel
|
d301d25fcc
|
Removed witches.live; have to check why it takes so long to check its peers on the server, while here it doesn’t; probably we should just increase the server RAM and/or CPU
|
2022-12-26 15:02:50 +01:00 |
|
pezcurrel
|
3f7a5ff69c
|
Moved $graceline definition up, fixing a bug; made peers checking msgs more informative
|
2022-12-26 15:01:35 +01:00 |
|
pezcurrel
|
97f5b99654
|
Removed ckpeers function, it was overkill; added a preliminary check for “stringness” to the checks on each peer
|
2022-12-26 14:51:42 +01:00 |
|
pezcurrel
|
a4b2ae731c
|
Added witches.live since its peers list is huge and full of .activitypub-troll.cf
|
2022-12-26 14:50:26 +01:00 |
|
pezcurrel
|
926b1b0d73
|
Added ckpeers function to check if the json array returned by api/v1/instance/peers is well formed
|
2022-12-26 14:12:06 +01:00 |
|
pezcurrel
|
3c1621df1d
|
Added “LastOkCheckTS” to $instints (array of Instances columns of integer type)
|
2022-12-26 13:29:19 +01:00 |
|
pezcurrel
|
e820845775
|
Consider instances which have LastOkCheckTS=null but InsertTS>=$graceline as not dead, and to be checked
|
2022-12-26 13:28:09 +01:00 |
|
pezcurrel
|
61da12100f
|
Removed comment with Instances columns
|
2022-12-26 12:29:08 +01:00 |
|
pezcurrel
|
ebc458cc2c
|
Removed “resurrect” option and references to Instances.Dead
|
2022-12-26 12:28:21 +01:00 |
|
pezcurrel
|
1e1b2a99e9
|
Dropped Instances.Dead, using Instances.LastOkCheckTS now instead
|
2022-12-26 12:25:15 +01:00 |
|
pezcurrel
|
9b3cca9a45
|
Script to reasonably set Instances.LastOkCheckTS, first commit
|
2022-12-26 12:21:47 +01:00 |
|
pezcurrel
|
95b9ccfc31
|
Renamed “LastCheckOk” to “WasLastCheckOk”
|
2022-12-26 05:30:35 +01:00 |
|
pezcurrel
|
00caa1dcb9
|
Changed default for “deadline” option from 62 to 31 days
|
2022-12-26 05:17:59 +01:00 |
|
pezcurrel
|
44f437c928
|
Translated initial comment, made it more terse
|
2022-12-26 05:09:09 +01:00 |
|
pezcurrel
|
5312aea0cc
|
Added writing server rules in the db
|
2022-12-26 05:08:17 +01:00 |
|
pezcurrel
|
337eb32f51
|
Made mail optional, inactive by default
|
2022-12-25 23:45:31 +01:00 |
|
pezcurrel
|
429ab42ff5
|
Added 2 debug messages stating how many dead instances the script got from Instances and Peers tables
|
2022-12-25 18:55:54 +01:00 |
|
pezcurrel
|
119c9119c2
|
Made the logic for “deadline” much more terse
|
2022-12-25 18:41:13 +01:00 |
|
pezcurrel
|
acde202b2e
|
Made it write summary from crawl run into a log file of its own
|
2022-12-25 18:40:21 +01:00 |
|
pezcurrel
|
ba171bd5f2
|
Use specifically bash since we use &> redirect
|
2022-12-25 11:43:30 +01:00 |
|
pezcurrel
|
ec9b65e42f
|
Little script to run peerscrawl.php in loop; first commit
|
2022-12-25 11:32:49 +01:00 |
|
pezcurrel
|
10e2e1b58a
|
Added “lecho” for “message levels”, removed “gecho”, removed “verbose” option; removed “loop” option (do loop from a shell script if needed)
|
2022-12-25 11:32:08 +01:00 |
|
pezcurrel
|
1d0c6b799a
|
Small edit to “logminmsglev” and “tuiminmsglev” TUI option parsing errors
|
2022-12-25 11:29:34 +01:00 |
|
pezcurrel
|
d95bc70b8a
|
Exposed “deadline” option; minor changes
|
2022-12-25 09:47:04 +01:00 |
|
pezcurrel
|
c0802de828
|
Removed “restore” option: could work, but it’s not very useful and would require a big hassle; added loops and new found instances counters; made sighandler use mexit
|
2022-12-25 09:24:23 +01:00 |
|
pezcurrel
|
d6b77b0e29
|
Removed option “-p peers” from crawler cmdline because now peerscrawl directly writes new instances into the db
|
2022-12-24 08:59:33 +01:00 |
|
pezcurrel
|
9fabb3853b
|
Indeed
|
2022-12-23 19:13:37 +01:00 |
|
pezcurrel
|
96aa6f3aa9
|
That stuff there
|
2022-12-23 19:12:18 +01:00 |
|
pezcurrel
|
05fed0142c
|
Lowered a bit default values for “timeout” and “curltimeout”
|
2022-12-23 11:23:32 +01:00 |
|
pezcurrel
|
89a2ea0b26
|
Fixed “trending tags” ordering and fetching
|
2022-12-23 11:22:25 +01:00 |
|
pezcurrel
|
edee66b834
|
Temporarily disabled “restore” option because it needs more work to actually work
|
2022-12-22 15:32:30 +01:00 |
|
pezcurrel
|
61d0fcb3d8
|
Added “loop” option allowing the crawl to run in an infinite loop or until sig(int|hup|term) is received; other minor changes
|
2022-12-22 15:05:55 +01:00 |
|
pezcurrel
|
6477e8812f
|
Exposed “curltimeout” option; changed “timeout” default from 5 to 10; changed “curltimeout” default from 10 to 20
|
2022-12-22 14:24:48 +01:00 |
|
pezcurrel
|
6d4ce26f98
|
Adapted “restore” code to the new workings; minor changes and fixes
|
2022-12-22 14:04:29 +01:00 |
|
pezcurrel
|
c27053314a
|
Added code to store and consider “instance checks” made by the script to independently mark peers as dead
|
2022-12-22 11:32:18 +01:00 |
|
pezcurrel
|
706c831e23
|
Little change in a message
|
2022-12-22 11:28:29 +01:00 |
|
pezcurrel
|
f8cdf2cf3b
|
Changed check against “activity” values, which are strings, not integers
|
2022-12-22 07:40:41 +01:00 |
|
pezcurrel
|
c6c3feb500
|
Removed leftovers of “jsonwrite” option
|
2022-12-22 07:05:21 +01:00 |
|
pezcurrel
|
277296512c
|
Explicitly set idn_to_ascii flags, otherwise with php 7.3 it complained
|
2022-12-21 22:15:40 +01:00 |
|
pezcurrel
|
9316e686b9
|
Big rewrite, made it shorter and hopefully a bit more readable
|
2022-12-21 22:07:05 +01:00 |
|
pezcurrel
|
732ea79480
|
Moved $mastodons definition further up
|
2022-12-21 22:06:10 +01:00 |
|
pezcurrel
|
1c524ffd69
|
Moved mysqli_close after the optional loading of dead instances from the db; renamed $eta to $tet
|
2022-12-21 22:05:15 +01:00 |
|
pezcurrel
|
0d74dbf243
|
Got rid of akeavinn; other minor changes
|
2022-12-21 07:54:11 +01:00 |
|
pezcurrel
|
2c86580bfb
|
The regexp to decide whether an instance is Mastodon or not is now based on the Platforms table; made ckratelimit more precise about possible missing headers; added code to set Version from /api/v1/instance when it was not already set from nodeinfo
|
2022-12-21 06:53:31 +01:00 |
|
pezcurrel
|
4fdf287686
|
Little change in delete prompt
|
2022-12-20 23:02:53 +01:00 |
|
pezcurrel
|
d8f15f4b3a
|
Deletes from UserFields much faster
|
2022-12-20 23:01:13 +01:00 |
|
pezcurrel
|
f6dc080ed6
|
Major rewrite: it was a mess, now it is less :-)
|
2022-12-20 23:00:22 +01:00 |
|
pezcurrel
|
f30bcc7e5e
|
Updated to new delinstbyuid function form
|
2022-12-18 18:43:09 +01:00 |
|
pezcurrel
|
e744ae9d20
|
Fixed validhostname; added and using myq function for mysql queries; changed some exit codes; made connection error management better
|
2022-12-18 18:42:11 +01:00 |
|
pezcurrel
|
f29f636a70
|
Added mysql error code to error message that is echoed if a query fails
|
2022-12-18 18:40:22 +01:00 |
|
pezcurrel
|
cd16eabce5
|
Using new form of delinstbyuid function; removed references to New column
|
2022-12-18 18:27:22 +01:00 |
|
pezcurrel
|
835e02c171
|
Removed code referencing New, Good and Chosen columns
|
2022-12-18 18:26:03 +01:00 |
|
pezcurrel
|
f4aa3cb804
|
A script to set InsertTS field to something suitable when it is null
|
2022-12-18 18:24:35 +01:00 |
|
pezcurrel
|
c5debcb463
|
A function to delete an Instance record by ID, and all references to it in other tables; first commit
|
2022-12-18 18:23:52 +01:00 |
|
pezcurrel
|
ccc9f517fd
|
Moved code to delete an Instances record and all its references in other tables to function delinstbyid in lib/delinstbyid.php; minor changes
|
2022-12-18 11:44:11 +01:00 |
|
pezcurrel
|
2b0e2398ae
|
Made validhostname accept only valid hostnames :-)) (no ports or path specs)
|
2022-12-18 11:42:32 +01:00 |
|
pezcurrel
|
32251d1ba8
|
Added “deleteinstswhere” action
|
2022-12-18 11:41:09 +01:00 |
|
pezcurrel
|
7d2875075b
|
Deleted
|
2022-12-18 09:35:24 +01:00 |
|
pezcurrel
|
690b54521b
|
Moved
|
2022-12-18 09:35:04 +01:00 |
|
pezcurrel
|
f269bb901d
|
Little cosmetic change
|
2022-12-18 07:00:49 +01:00 |
|
pezcurrel
|
e9b88d6735
|
Made $jsonfp be written into run dir
|
2022-12-18 07:00:19 +01:00 |
|
pezcurrel
|
a3ada274e7
|
Removed stdout/err redirect in cmd; passing proper descriptor and pipe to proc_open; minor changes
|
2022-12-18 06:59:25 +01:00 |
|
pezcurrel
|
a32a25e095
|
Many many changes :-))
|
2022-12-18 00:34:27 +01:00 |
|
pezcurrel
|
d6dd03694c
|
Removed $context
|
2022-12-17 22:54:32 +01:00 |
|
pezcurrel
|
441d16a42d
|
ckratelimit goes to sleep only when x-ratelimit-remaining==0; can spit debug info; the limit for fetching chunks from users directories is now 40
|
2022-12-17 22:54:02 +01:00 |
|
pezcurrel
|
ca4367b719
|
Removed the unlinking attempt at lockfp before exit: it was already done before by shutdown; other little changes about open files closing and the like
|
2022-12-17 18:43:13 +01:00 |
|
pezcurrel
|
e5ad18e619
|
Fixed a typo
|
2022-12-17 18:40:55 +01:00 |
|