/t/ - Technology

Discussion of Technology

(4.11 KB 300x100 simplebanner.png)

Hydrus Network General #4 Anonymous Board volunteer 04/16/2022 (Sat) 17:14:57 No. 8151
This is a thread for releases, bug reports, and other discussion for the hydrus network software. The hydrus network client is an application written for Anon and other internet-fluent media nerds who have large image/swf/webm collections. It browses with tags instead of folders, a little like a booru on your desktop. Advanced users can share tags and files anonymously through custom servers that any user may run. Everything is free, privacy is the first concern, and the source code is included with the release. Releases are available for Windows, Linux, and macOS. I am the hydrus developer. I am continually working on the software and try to put out a new release every Wednesday by 8pm EST. Past hydrus imageboard discussion, and these generals as they hit the post limit, are being archived at >>>/hydrus/ . If you would like to learn more, please check out the extensive help and getting started guide here: https://hydrusnetwork.github.io/hydrus/
https://www.youtube.com/watch?v=AQOfIENN2tk

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v488d/Hydrus.Network.488d.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v488d/Hydrus.Network.488d.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v488d/Hydrus.Network.488d.-.macOS.-.App.dmg
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v488d/Hydrus.Network.488d.-.Linux.-.Executable.tar.gz

I had a good simple week making a clean release before my vacation. Everything is misc this week, nothing earth-shattering, just a bunch of cleanup and little stuff. If you have any wavpack files, try importing them!

full list

- the client now supports 'wavpack' files. these are basically a kind of compressed wav. mpv seems to play them fine too!
- added a new file maintenance action, 'if file is missing, note it in log', which records the metadata about missing files to the database directory but takes no other action
- the 'file is missing/incorrect' file maintenance jobs now also export the files' tags to the database directory, to further help identify them
- simplified the logic behind the 'remove files if they are trashed' option. it should fire off more reliably now, even if you have a weird multiple-domain location for the current page, and still not fire if you are actually looking at the trash
- if you paste a URL into the normal 'urls' downloader page, and it already has that URL and the URL has status 'failed', that existing URL will now be tried again. let's see how this works IRL, maybe it needs an option, maybe this feels natural when it comes up
- the default bandwidth rules are boosted. the client is more efficient these days and doesn't need so many forced breaks on big import lists, and the internet has generally moved on. thanks to the users who helped talk out what the new limits should aim at. if you are an existing user, you can change your current defaults under _network->data->review bandwidth usage and edit rules_--there's even a button to revert your defaults 'back' to these new rules
- now, like all its neighbours, the cog icon on the duplicate right-side hover no longer annoyingly steals keyboard focus on a click
- did some code and logic cleanup around 'delete files', particularly to improve repository update deletes now that we have multiple local file services, and in planning for future maintenance in this area
- all the 'yes yes no' dialogs--the ones with multiple yes options--are moved to the newer panel system and will render their size and layout a bit more uniformly
- may have fixed an issue with a very slow to boot client trying to politely wait on the thumbnail cache before it instantiates
- misc UI text rewording and layout flag fixes
- fixed some jank formatting on database migration help

next week

I am now off for a week. I think I need it! I'm going to play a ton of vidya, shitpost the big streams that are happening, fit some Wagner in, and get on top of outstanding IRL stuff. I'll be back to catch up my messages on Saturday the 18th. Thanks everyone!
trying to use Hydrus for the first time; is there a way to add a subscription for videos specifically, so that it leaves out photos?
(480.53 KB 640x360 shitposting.gif)

>>8675 Have a nice vacation OP and watch out for fucking normies.
id:6549088 from gelbooru (NSFW), with the download decompression bomb check deactivated. When downloading this specific picture, before it finishes downloading, it makes the program jump to 3 GB of RAM until I close it. It opens normally in a browser, but spikes to 3 GB in hydrus, and since I only have 4 GB it makes the PC freeze. Just wanted to report that. Also, not a native English speaker here.
>>8679 forgot, using version 474
>>8668 Reporting in that v488 seems to have fixed both these bugs. There's no longer the thumbnail exception being logged, the startup time to get to a UI window is quicker, and the parent-sync status un-stuck itself. Hooray!
>>8645 This is about what I figured. I pulled the database from a dying hard drive a few months ago. Every integrity scan between now and then ran clean, but I had a suspicion something had gotten fucked up somewhere along the line. Since it's been a minute, any backups are either also corrupted or too old to be useful. Luckily, re-constructing them hasn't been too painful. I made an "unknown tag:*anything*" search page, then right-click->search individual tags to see what's in them. Most have enough files in them to give context to what the tag used to be, so I'll just replace it. It's been a good excuse to go through old files, clean up inconsistent tags, set new and better parent/sibling relationships, etc, so it's actually been quite pleasing to my autisms. I had 80k files with an unknown tag back when I started cleaning up, and now I'm down to just under 40k. I'm sure I've lost some artist/title tags from images with deleted sources, or old filenames, but all in all, it could be much worse.
Thanks man! Have a good vacation!
>>8676 if you're just subscribing to a booru, it will generally have a "video" tag. you can add "video" to the tag search.
>>8703 nope, not a booru. So there isn't a way to filter that. awh.
Is there any way to get Hydrus to automatically tag images with the tags present in the metadata? Specifically the tags metadata field; my whole collection was downloaded using Grabber.
>>8710 my*
>>8709 What website is it? You might be able to add to/alter the parser to spit out the file type by reading the json or file ending, then use a whitelist to only get certain file endings (e.g. videos).
I've been using hydrus for a while now and am in the process of importing all my files. Is there any downside to checking the "add filename? [namespace]" button while importing? I think I've got over 300k images, so it would create a lot of unique tags, if that's a problem.
About how long do you estimate it might take before hydrus will be able to support any file? I specifically need plaintext files and html files (odd, I know), if that makes a difference. The main thing is just that it'd be nice to have all my files together in hydrus instead of needing to keep my html and (especially) my text files separate from the pics and vids. Also, I'm curious: why can't hydrus simply "support" all filetypes by just having an "open externally" button for files it doesn't have a viewer for? It already does that for things like flash files, after all.
>>8627 >>8636 >>8646 It seems to be working now; not sure what changed, but somehow arch doesn't always mount the samba directory anymore and needs a manual command on boot now, which it didn't before. Maybe it was some hiccup, maybe some package I happened to install as I installed more crap, maybe it was a samba bug that got updated.
Is there a way to reset the file history graph, under Help?
>>8668 >>8681 Great, thanks for letting me know!

>>8671 Thank you. The modified date for that direct file was this:

Last-Modified: Thu, 01 Jan 1970 00:00:01 GMT

I thought my 'this is a stupid date m8' check would catch this, but obviously not, so I will check it! Sorry for the trouble. I'll have ways to inspect and fix these numbers better in future.

>>8674 I'm sorry to say I don't understand this:
>"import these files and use the filename as the original file hash, then set imported as better and delete the other if imported is smaller"
But if you mean broadly that you want some better metadata algebra for mass actions, I do hope to have more of this in future. In terms of copying metadata from one thing to another, I just need to clean up and unify and update the code. It is all a hellish mess from my original write of the duplicates system years ago, and it needs work to both function better and be easier to use.
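For illustration, a minimal Python sketch of the kind of 'stupid date' check described here -- not hydrus's actual code, and assuming the one-week rejection window mentioned in the v489 notes later in the thread:

    import email.utils
    import datetime

    EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
    REJECT_WINDOW = datetime.timedelta(days=7)  # assumed window, per the v489 notes

    def parse_sane_modified_date(header_value):
        # handles RFC 2822 dates like 'Thu, 01 Jan 1970 00:00:01 GMT'
        dt = email.utils.parsedate_to_datetime(header_value)
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=datetime.timezone.utc)
        if dt - EPOCH < REJECT_WINDOW:
            return None  # 'this is a stupid date m8' -- discard it
        return dt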
>>8676 >>8703 >>8709 >>8716 In the nearish future, I will add a filetype filter to 'file import options', just like Import Folders have, so you'll be able to do this. Sorry for the trouble here, this will be better in a bit!

>>8679 >>8680 I'm sorry, are you sure you have the right id there? The gif of the frog girl from Boku no Hero Academia? I don't have any trouble importing or viewing that file, and by the looks of it, it doesn't seem too bloated, although it is a 30MB gif, so I think your memory spike was something else that happened at the same time as (and probably blocked) the import. Normally, decompression bombs are png files, stuff like 12,000x18,000 patreon rewards and similar. I have had several reports of users with gigantic memory spikes recently, particularly related to looking at images in the media viewer. I am investigating this. Can you try importing/opening that file again in your client and let me know if the memory spike is repeatable? If not, please let me know if you still get memory spikes at other times, and more broadly, whether future updates help the situation. Actually, now I think of it: if you were on 474, I may have fixed your gigantic memory issue in a recent update. I did some work on more cleanly flushing some database journal data, which was causing memory bloat a bit like you saw here, so please update and then let me know if you still get the problem.

>>8688 Good luck!
>>8710 Not yet. I don't inspect EXIF much yet, but I expect some sort of retroactive parser in future. Or I wouldn't be surprised if a user figures out a Client API tool to do this. Unless you mean NTFS tags, in which case I am even less expert. I know there are some tools that can convert NTFS tags into xml files, and I know a user once did that and then munged those files into .txt files for tag import, but I've never done any of that stuff myself.

>>8721 If you do this, make a new tag service for your filename tags under services->manage services->add->local tag service. Call it 'filenames' or something. The downside is that these tags are messy. 300k tags won't add much lag, maybe a 0.5-2% slower file load kind of thing, but they will get in the way, and most users find they don't actually want them all that often. Putting them in another service puts them in a little box on their own where it is easier to hide, compartmentalise, and potentially delete them in future without affecting your 'real' search tags.

>>8722 Not sure. It is number 6 on the 'big stuff' list here: https://github.com/hydrusnetwork/hydrus/issues?q=is%3Aissue+is%3Aopen+sort%3Areactions-%2B1-desc ('Support more filetypes / arbitrary file import'), so I see it happening, and very likely within the next three years. I also want to store my .txt and .html files. I have thousands of fanfics from a sordid past life of mine that I want to categorise.

The problem I need to overcome is that hydrus is currently predicated on the ability to infer a filetype from file content alone. The reasons for this are some bullshit technical stuff mostly related to maintenance and weird downloads, but it is currently needed. If you toss a file called 'file.file' at hydrus, it needs to be able to figure out whether it is a jpeg or an mp4 just by looking at its insides. Most media files have rigid formats, literally a few bytes like 'offset 8 bytes, WEBP', that make it easy to recognise them very quickly. Text and HTML have very dynamic content, so figuring out what they are is more tricky.

Before I allow all files, I may be able to straight up support text and html, but there will still be problems. HTML is doable, since you basically run it through a parser and see if it raises an error, but then you have to determine whether it was HTML or XML. I expect to start work on this soon, since some formats (SVG and some other open-source image-editing formats) are just XML, so I'll start recognising the broad category of XML and then try recognising keywords or those little XML 'this is what I am' tags (I think they are called DTD or something), and we may just fall into HTML support by happy accident. Raw .txt is much more difficult. A one-byte text file containing 'a' is a text file as much as a book in Japanese unicode is. I probably can't recognise that versus any other arbitrary format, although I can probably get a high-confidence guess.

Supporting arbitrary files will require some import and maintenance rejigging. I'll no longer be able to assume I can always figure out the mime of a file, and I'll have to pass the mime along from the original file extension or whatever. ALSO, there are secondary issues. At the moment, if the hydrus downloader runs into an HTML file when it expected a jpeg (e.g. some kind of fucked up 404 message that gave 200 instead of 404, which happens sometimes), I raise an 'ignored' error and say 'I think this downloader needs to be taught how to parse this document'. But when we can support HTML, what do I do then? Do I import the HTML error page as a file? I'll have to do something to the import workflow in general to say when text/html is ok and not ok. I'm leaning towards allowing text files on hard drive import and then disallowing them on downloader import unless the URL Class specifically specifies it, but how the hell I make that user-friendly I'm not sure yet.

Anyway, sorry, I went on a bit there, but that's the basic background. It will come, but it will be a big job, so I need to clear out some other things first. I'm basically done with multiple local file services now, so I'm moving on to some server updates and janny workflow improvements for my next big job. We'll see if that takes me all the rest of this year, but I hope I can clear it out faster, and then move on to the next thing.
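To illustrate the 'rigid formats' point above, here is a minimal Python sketch (not hydrus's actual code) of recognising a few media types purely from their leading bytes, including the 'WEBP' marker at offset 8:

    def sniff_mime(path):
        # read enough header bytes to test each signature
        with open(path, 'rb') as f:
            header = f.read(16)
        if header.startswith(b'\xff\xd8\xff'):
            return 'image/jpeg'
        if header.startswith(b'\x89PNG\r\n\x1a\n'):
            return 'image/png'
        if header.startswith(b'GIF87a') or header.startswith(b'GIF89a'):
            return 'image/gif'
        if header.startswith(b'RIFF') and header[8:12] == b'WEBP':
            return 'image/webp'
        return None  # no rigid signature matched; could be text, html, or anything

Text and .txt files have no such signature, which is exactly the problem described above.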
>>8723 Great, let me know how things go in future!

>>8725 What part would you like to 'reset'? All the data it presents is built on real-world stuff in your client, like actual import and archive times. Do you want to change your import times, or maybe clear out your deleted file record?
I had a good week. I did a mix of cleanup and UI improvements, plus an important bug fix for users who have had trouble syncing to the PTR. The release should be as normal tomorrow.
when trying to do a file relationship search, is there a way to search for same-quality duplicates? I don't see any way to do that, and every time I look at the relationships of a file manually, it's always a better/worse pair. Does Hydrus just randomly assign one of the files as being better when you say they're the same quality?
https://www.youtube.com/watch?v=6rboksqjPy4

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v489/Hydrus.Network.489.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v489/Hydrus.Network.489.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v489/Hydrus.Network.489.-.macOS.-.App.dmg
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v489/Hydrus.Network.489.-.Linux.-.Executable.tar.gz

I had a good week getting back into the swing of things. I fixed some important bugs and improved some UI.

highlights

All the downloader pages--gallery, watcher, urls, and simple--have a revamped status system. All the text that shows how file or gallery downloads are going is now generated in a better way, with more error states (e.g. it will tell you when your gallery stopped because it hit the file limit, or when one of the emergency pause states under the network menu has kicked in), and logic in edge cases is improved. Everything is unified now, so the texts are the same across all pages. Also, if a gallery query or watched thread is 'pending', its text now reports that it is waiting for a work slot, rather than staying blank. There _shouldn't_ be any situations now where a downloader is unpaused with work to do but has blank status.

If you use the multiple local file services system, the archive/delete filter now presents more options when you are done. If the files are in more than one local file service, you can choose where you delete them from, including all applicable. This was confusing and opaque before, so I hope this makes it more clear what is happening and gives you more choice.

I _believe_ I have fixed an important bug some users were having with PTR processing. There was an annoying issue about a 'definitions' file being seen as a 'content' file, or vice versa, that the automatic maintenance could not fix. I finally managed to reproduce the issue and fixed it. I scheduled a fix in the update this week, so if you have been hit by this, please wait for one more round of file maintenance 'metadata' scans, and then unpause the PTR one more time. Essentially, I think I fixed the automatic maintenance. Let me know how you get on!

full list

- downloader pages:
- greatly improved the status reporting for downloader pages. the way the little text updates on your file and gallery progress are generated and presented is overhauled, and texts are unified across the different downloader pages. you now get specific texts on all possible reasons the queue cannot currently process, such as the emergency pause states under the _network_ menu or specific info like hitting the file limit, and all the code involved here is much cleaner
- the 'working/pending' status, when you have a whole bunch of galleries or watchers wanting to run at the same time, is now calculated more reliably, and the UI will report 'waiting for a work slot' on pending jobs. no more blank pending!
- when you pause mid-job, the 'pausing - status' text is generated a little more neatly too
- with luck, we'll also have fewer examples of 64KB of 503 error html spamming the UI
- any critical unhandled errors during importing proper now stop that queue until a client restart and make an appropriate status text and popup (in some situations, they previously could spam every thirty seconds)
- the simple downloader and urls downloader now support the 'delay work until later' error system. actual UI for status reporting on these downloaders remains limited, however
- a bunch of misc downloader page cleanup
- .
- archive/delete:
- the final 'commit/forget/back' confirmation dialog on the archive/delete filter now lists all the possible local file domains you could delete from, with separate file counts and 'commit' buttons, including 'all my files' if there are multiple, defaulting to the parent page's location at the top of the list. this lets you do a 'yes, purge all these from everywhere' delete or a 'no, just from here' delete as needed, and generally makes what is going on more visible
- fixed archive/delete commit for users with the 'archived file delete lock' turned on
- .
- misc:
- fixed a bug in the parsing sanity check that makes sure bad 'last modified' timestamps are not added. some ~1970-01-01 results were slipping through. on update, all modified dates within a week of this epoch will be retroactively removed
- the 'connection' panel in the options now lets you configure how many times a network request can retry connections and requests. the logic behind these values is improved, too--network jobs now count connection and request errors separately
- optimised the master tag update routine when you petition tags
- the Client API help for /add_tags/add_tags now clarifies that deleting a tag that does not exist _will_ make a change--it makes a deletion record
- thanks to a user, the 'getting started with files' help has had a pass
- I looked into memory bloat some users are seeing after media viewer use, but I couldn't reproduce it locally. I am now making a plan to finally integrate a memory profiler and add some memory debug UI so we can better see what is going on when a couple gigs suddenly appear
- .
- important repository processing fixes:
- I've been trying to chase down a persistent processing bug some users got, where no matter what resyncs or checks they do, a content update seems to be cast as a definition update. fingers crossed, I have finally fixed it this week. it turns out there was a bug near my 'is this a definition or a content update?' check that is used for auto-repair maintenance here (long story short, ffmpeg was false-positive discovering mpegs in json). whatever the case, I have scheduled all users for a repository update file metadata check, so with luck anyone with a bad record will be fixed automatically within a few hours of background work. anyone who encounters this problem in future should be fixed by the automatic repair too. thank you very much to the patient users who sent in reports about this and worked with me to figure this out. please try processing again, and let me know if you still have any issues
- I also cleaned some of the maintenance code, and made it more aggressive, so 'do a full metadata resync' is now even more uncompromising
- also, the repository updates file service gets a bit of cleanup. it seems some ghost files have snuck in there over time, and today their records are corrected. the bug that let this happen in the first place is also fixed
- there remains an issue where some users' clients have tried to hit the PTR with 404ing update file hashes. I am still investigating this

next week

I ended up doing more cleanup this week than I expected, but I'm happy to have the downloader pages reporting better. They were a real knot before. I want to spend a little admin time next week, triaging final multiple local file services work and planning future server improvements for when that is done, and then I think I'd like to focus on more small jobs, including some github issues.
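As an illustration of the 'count connection and request errors separately' item in the list above, a Python sketch with assumed limits -- the real ones are configurable under options->connection, and this is not the actual hydrus logic:

    import requests

    MAX_CONNECTION_ERRORS = 3  # assumed limits, for illustration only
    MAX_REQUEST_ERRORS = 5

    def fetch_with_separate_retry_counts(url):
        connection_errors = 0
        request_errors = 0
        while True:
            try:
                response = requests.get(url, timeout=30)
                response.raise_for_status()
                return response
            except requests.exceptions.ConnectionError:
                # could not reach the server at all
                connection_errors += 1
                if connection_errors > MAX_CONNECTION_ERRORS:
                    raise
            except requests.exceptions.RequestException:
                # connected, but the request itself failed (e.g. a 503)
                request_errors += 1
                if request_errors > MAX_REQUEST_ERRORS:
                    raise

Note that ConnectionError has to be caught before RequestException, since it is a subclass; that ordering is what keeps the two counts separate.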
>>8743 Yes, 'same quality' actually chooses the current file to be the better, just as if you clicked 'this is better', but with a different set of merge options.

The first version of the duplicate system supported multiple true 'these are the same' relationships, but it was incredibly complicated to maintain and didn't lend itself to real-world workflows, so in the end I reinvented the system to have a single 'king' that stands atop a blob of duplicates. I have some diagrams here: https://hydrusnetwork.github.io/hydrus/duplicates.html#duplicates_advanced

I don't really like 'this is the same' ending up being a soft 'this is better', but I think it is an ok compromise for what we actually want, which is broadly to figure out the best of a group of files. If they are the same quality, then it doesn't ultimately matter much which is promoted to king, since they are the same. I may revisit this topic in a future iteration of duplicates, but I'm not sure what I really want beyond much better relationship visibility, so you can see how files are related to each other and navigate those relationships quickly.

Can you say more about why you wanted to see the same-quality duplicate in this situation? Hearing that user story can help me plan workflows in future.
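A toy Python model of the 'king' arrangement described above -- purely illustrative, not the real implementation, and it ignores merging existing groups, alternates, and the merge options:

    class DuplicateGroup:
        # one king stands atop a blob of duplicates
        def __init__(self, king):
            self.king = king
            self.members = {king}

        def set_better(self, better, worse):
            # the better file becomes king; the worse joins the blob
            self.members.update((better, worse))
            self.king = better

        def set_same_quality(self, current, other):
            # 'same quality' still promotes one side: the file you are looking at
            self.set_better(current, other)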
(426.25 KB 958x538 64b0.png)

(8.03 KB 503x125 ClipboardImage.png)

>>8151 What do I do for this? I'm just trying to have my folder of 9,215 images tagged.
What installer does Hydrus use? I'm trying to set up an easy updating script with Chocolatey (since whoever maintains the winget repo is retarded).
>>8755 Figured it out: GitHub artifacts show InnoSetup. Too bad Chocolatey's docs are half fucking fake and they don't do shit unless you give them money. This command might work, but choco's --install-arguments option doesn't work like the fuckwads claim it does:

choco upgrade hydrus-network --ia='"/DIR=C:\x\Hydrus Network"'
(11.25 KB 644x68 ClipboardImage.png)

>>8756 No, actually, that command doesn't work, because the people behind chocolatey are lying fucking hoebags. Seeing this horseshit, after THEY THEMSELVES purposefully obfuscated this bullshit, is FUCKING INFURIATING.
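For what it's worth, Inno Setup installers generally accept the standard silent-install flags when run directly, so skipping choco's argument-passing entirely might work -- untested against the hydrus installer specifically, and the directory here is just an example:

    "Hydrus.Network.489.-.Windows.-.Installer.exe" /VERYSILENT /NORESTART /DIR="C:\x\Hydrus Network"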
>>8745 The main thing I wanted to do is compare the number of files that were marked as lower-quality duplicates, across files from different url domains, with files that aren't lower-quality duplicates (either kings, or alts, or no relationships), to see which domains tend to give me the highest ratio of files that end up being deleted later as bad dupes and which give me the lowest, so I know which ones I should be more adamant about downloading from and which ones I should be more hesitant about. This doesn't really work that well if same-quality duplicates can also be considered "bad dupes" by hydrus, because that means I'm getting a bunch of files in the search that shouldn't be there, since they're not actually worse duplicates, just same-quality duplicates that hydrus treats as worse arbitrarily.

Basically, I was trying to create a ranking of sites that tend to give me the highest percentage of low-quality dupes and ones that give me the lowest. I can't do that if the information hydrus has about file relationships is inaccurate, though. It's also a bit confusing when I manually look at a file's relationships, because I always delete worse duplicates, but then I saw many files that are considered worse duplicates and I thought to myself "did I forget to delete it that time?". Now this makes sense, but it still feels wrong to me somehow.
(2.77 KB 306x117 windozeerror.png)

>>8757 >2022 and still using windoze
Time to dump the enemy's backdoor.
>>8753 The good catch-all solution here is to hit up services->review services and click 'refresh account' on the repository page. That forces all current errors to clear out and tries to do a basic network resync immediately. Assuming your internet connection and the server are ok again, it'll fix itself and you can upload again.

>>8755 >>8756 >>8757 Yeah, Inno. There are some /silent or similar commands I know you can give the installer to make it run quietly, and in fact that's one reason the installer now defaults to not checking the 'open client' box on the last page, so an automatic installer a guy was making can work in the background. I'm afraid I am no expert in it, though. If I can help you here, let me know what I can do.

>>8758 Ah, yeah, sorry--there's no real detailed log kept or data structure made of your precise decisions. If you always delete worse duplicates, though, then I think you can get an analogue for the data you want. Any time you have a duplicate that is still in 'my files', you know it was set as 'same quality', since it wasn't deleted. Any time a duplicate is deleted, you know you set it as 'worse'. If you did something like:

'sort by modified time' (maybe a creator tag to reduce the number of results)
system:file relationships: > 0 dupe relationships

and then switch between 'my files' and 'all known files' (you need help->advanced mode on to see this), you'll see the local 'worse' files (you set same quality) versus the non-local 'worse' files (you set worse-and-delete), and see the difference.

In future, btw, I'd like thumbnails to know more about their duplicates, so we can finally have 'sort files by duplicate status' and group them together a bit better in large file count pages.

If you are trying to do this using manual database access in SQLite and want en masse statistical results, let me know. The database structure for this is a pain in the ass, and figuring out how to join it to my files vs all known files would be difficult going in blind.
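A Python sketch of the tally >>8758 is after, assuming you have somehow exported per-file records of source domain and whether the file survived in 'my files' -- the record format here is made up, and gathering it is the hard part the post above describes:

    from collections import Counter

    def worse_dupe_ratios(records):
        # records: iterable of (domain, still_in_my_files) pairs
        worse = Counter()
        total = Counter()
        for domain, still_in_my_files in records:
            total[domain] += 1
            if not still_in_my_files:
                worse[domain] += 1  # deleted dupe: it was set as 'worse'
        return {domain: worse[domain] / total[domain] for domain in total}

    # worse_dupe_ratios([('gelbooru.com', True), ('gelbooru.com', False)])
    # -> {'gelbooru.com': 0.5}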
>>8759 >Unironically being that guy
Buddy, you just replied to a reply about easier updating with something that would make it ten times harder. Not to mention that hilariously dated meme.

>>8760 Yeah, Choco passes /verysilent IIRC, and /DIR would work, but Powershell's quote parsing is fucking indecipherable, Choco's documentation on the matter is outright wrong, and I can't 'sudo' in cmd. I'm considering writing a script to just produce update PRs for the Winget repo myself, since it's starting to seem like that would be easier, but I don't want to go through all of Github's API shit.
PySide is nearly PyPy-compatible (see https://bugreports.qt.io/browse/PYSIDE-535). What work would need to be done in Hydrus to support running under PyPy?

