Posts

Showing posts from October, 2024

Wow. That took a while!

So I've managed, in a little over two days, to complete the archiving of a single issue of the Bar Hill News. 40 pages. Suffice to say, if it takes this long every time, there's a good chance the "time left" to archive the 600+ issues will increase on a month-on-month basis as I add new issues! It was interesting going through a magazine I designed MYSELF to see what I had forgotten about the layout as I wrote the descriptive language needed to get this all to work. The first thing to say is that it took a LONG time to get started. I created a Numbers spreadsheet with the approximate layout - two columns, 20 slots. For most pages this would be more than adequate. In fact, it worked pretty well having this as the "default" for all the pages. From memory, there were precisely two pages that caused me to deviate from the standard. The first was the front cover page, to the right. You see all the boxes at the Botto...

And So It Begins ...

I don't think I'll ever get tired of using that Babylon 5 quote ... but I digress. Fundamentally, though, I'm going to start with the latest issue of the Bar Hill News: October 2024, Issue 661. It's relatively easy for me to get hold of it - I'm the editor - and the copyright issues are absolutely minimal.

What Format to Use and Where to Store the Maps?

Having been through ZXDB (on GitHub), I am in two minds about starting with a database. Now, don't get me wrong. I will install Postgres locally on my Mac and create a database, and everything will go in there, but it is worth taking one step up first and writing the output data as JSON in a one-file-per-magazine structure. The advantage I see in this is that once I have the file generated, I can import it into Postgres and go from there. Should someone else pick a different database, they will have the base files to work with as well. Like ZXDB, I have created a GitHub project for this and, because I should never be allowed to name anything, I have chosen the name "Total Magazine Archive Project", or TMAP for short. See? I told you I shouldn't name things. The GitHub link is here (the project is public). So, at least one decision has been made: GitHub will be used to store the maps. Regarding the files themselves, I'm looking at JSON. For example; {   ...
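As a rough illustration of the one-file-per-magazine idea, here is a minimal Python sketch. It is not the TMAP schema: the field names (`magazine_slug`, `issue_number`, `date`, `pages`), the file naming scheme, and the helper functions `write_issue` and `load_issues` are all hypothetical, made up purely to show the write-then-import flow.

```python
import json
import tempfile
from pathlib import Path


def write_issue(directory: Path, issue: dict) -> Path:
    """Write one magazine issue to its own JSON file and return the path."""
    # Hypothetical naming scheme: <magazine-slug>-<issue number>.json
    path = directory / f"{issue['magazine_slug']}-{issue['issue_number']}.json"
    path.write_text(json.dumps(issue, indent=2, ensure_ascii=False), encoding="utf-8")
    return path


def load_issues(directory: Path) -> list[dict]:
    """Read every issue file back, e.g. as a first step before a database import."""
    return [
        json.loads(p.read_text(encoding="utf-8"))
        for p in sorted(directory.glob("*.json"))
    ]


# Hypothetical record: the field names are illustrative, not the project's format.
example_issue = {
    "magazine_slug": "bar-hill-news",
    "issue_number": 661,
    "date": "2024-10",
    "pages": [],
}

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        written = write_issue(Path(tmp), example_issue)
        print(written.name)
        print(load_issues(Path(tmp)) == [example_issue])
```

Because each issue lives in its own plain file, anyone who prefers a different database only needs a loader like `load_issues`; nothing in the files ties them to Postgres.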

Building a Map for Individual Pages

I spent a lot of time thinking about how to do this. It was clear that there needed to be some kind of pretty flexible language for writing the document, and that millimeter-perfect accuracy was not something to aim for - it would increase the complexity too much, which in turn would increase the time necessary to complete the archive. So I started work looking at two pages from the Micro User. It quickly became apparent that the Bar Hill News just wasn't a good place to start: for one thing, it is A5 and therefore has significantly less complexity than a large magazine (or a newspaper). A model that can handle more complexity can always be used for less complex material, but a model designed for less complex material will fail when provided with material at the more complex end of the spectrum. To the right is the first of the example pages. The first thing to note is the page is split into five columns with a page header spanning across all five, and then containing three articles (...
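One way to picture a coarse, non-millimeter-perfect page map is as a grid of columns and row slots, with each block on the page claiming a rectangle of cells. The Python sketch below is only an assumption about how such a map could look, not the descriptive language actually used in the project: the `Region` class, the `overlapping` check, the 20-slot page height, and the positions of the three articles are all invented for illustration. Only the five columns, the header spanning all five, and the count of three articles come from the page described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A rectangular block on a page, measured in whole columns and row slots."""
    kind: str          # hypothetical labels such as "header" or "article"
    first_column: int  # 1-based, inclusive
    last_column: int   # 1-based, inclusive
    first_slot: int    # 1-based, inclusive
    last_slot: int     # 1-based, inclusive

    def cells(self) -> set[tuple[int, int]]:
        """Return every (column, slot) cell covered by this region."""
        return {
            (column, slot)
            for column in range(self.first_column, self.last_column + 1)
            for slot in range(self.first_slot, self.last_slot + 1)
        }


def overlapping(regions: list[Region]) -> bool:
    """Return True if any two regions claim the same cell."""
    seen: set[tuple[int, int]] = set()
    for region in regions:
        cells = region.cells()
        if seen & cells:
            return True
        seen |= cells
    return False


COLUMNS = 5  # from the example page
SLOTS = 20   # assumed page height in slots, not taken from the magazine

# Header across all five columns, then three articles at made-up positions.
page = [
    Region("header", 1, COLUMNS, 1, 2),
    Region("article", 1, 2, 3, SLOTS),
    Region("article", 3, 5, 3, 10),
    Region("article", 3, 5, 11, SLOTS),
]

if __name__ == "__main__":
    print(overlapping(page))
```

The same structure covers the simpler A5 case by setting the column count to two, which matches the point that a model built for complex pages still works for simple ones.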

Magazine Archiving

As you can probably tell if you've checked out my Internet Archive uploads, the vast majority of them are magazines. The problem is, even with the best OCR in the world, searching for a specific item is impossible at worst and very unreliable at best. It can be very frustrating knowing something is there and not being able to find it. This, it seems, is not a problem unique to me. One of my other interests is Retro Gaming - specifically the BBC B microcomputer from the early '80s. This was my first computer and I look back on it very fondly. At the time I was using it, I had subscriptions to the Acorn User and Micro User magazines. These suffer from the same issue. Despite some efforts to use OCR to read the text from articles, etc, there just doesn't seem to be a way to - for example - search for all the Repton game reviews, find an individual picture, get a list of software titles advertised in the magazine, etc.  Fo...

What's this? An archiving blog?

It is! Well spotted. The purpose of this blog is to allow me to record some of the archiving work I do both locally, in my village, and for retro-computer systems in general (although with an Acorn/BBC slant). If you're looking to find my uploads, then the site I upload the most to is the Internet Archive. Here are a couple of examples: "Pace Micro Technology - Micronet Terminal for the BBC Micro Computer" and "Bar Hill News 200204 April 2002 Issue 386". Scanning-wise, I tend to use an HP OfficeJet Pro 9120e Series for A4 documents (of which there are a lot!), photos, and other smaller-than-A4 or hand-written documents. This works quite well and also works in a very automated way. I can often just set it going and do other things. Of course, there will always be things that aren't suitable for feeding through a scanner. In this case, I have the Scanner Pro app on my iPhone. I have used this for years and find it incredibly useful for scanning books and odd-shaped items.  And final...