Channel: Gwulo: Old Hong Kong

The Jurors List for 1941

As I've just posted the 1941 Jurors List online, now's a good chance to talk about using the Jurors Lists for research. This list is also a bit different as I've used OCR (optical character recognition) to convert the scanned document into text, so I'll give some tips in case you're thinking about using it. I'll finish with some thoughts on sharing this type of work.

1. What can you find in a Jurors List?

This list is a snapshot of Hong Kong in early 1941. Let's see what it can tell us about a couple of the authors of our wartime diaries.

1.1 Barbara Anslow

Unfortunately Barbara definitely won't be listed, as women couldn't serve as jurors in Hong Kong until after WW2.

Her father, Mr Redwood, was in Hong Kong at the time, working in the Navy Dockyard. And the two Mr Anslows, her future husband and father-in-law, were in Hong Kong too. But as they worked in the armed services or civil service, they couldn't be jurors either.

So, a good example of the limitations of the Jurors List, but we'll have much better luck with our next author.

1.2 Paul Atroshenko

He was just 4 at the time, so far too young to be a Juror. We can find his father though:

Atroshenko, John | Overseer, Marsman, Hong Kong (China), Ld. | 5 Ashley Road, Kowloon.

That's helpful. Paul remembered his father had worked for Marsden, but we can let him know it was actually Marsman.

Then in an email Paul asked if we could tell him anything about the Smirnoff sisters:

I can't be certain, but I think that the three girls who my brother and I visited during that first American bombing raid may have been the Smirnoff sisters. Perhaps you can ask for info on this in your blog. Someone might remember. The block of flats they lived in was on a road running parallel to Nathan Road and to the East of it. I know that it wasn't too far North of the Peninsula Hotel. It didn't take us long to get home, which was then close to the Star Ferry terminal.

There's an entry for Smirnoff in the Jurors List too:

Smirnoff, George Vitalievitch | Architect, Marsman, H. K. China Ld. | 14 Hart Avenue, Kowloon.

We can see two connections between the fathers: a similar heritage and the same employer. The addresses fit too, with Hart Avenue on the east of Nathan Road, and not too far from Ashley Road.

So the obvious information from a single list includes full names, title & employer, and an address. (Note that sometimes the address shows where the person lived, and sometimes where they worked.)

With a bit more investigation, and by comparing lists over several years, you can see how businesses rise and fall, new residential areas appear, and the nature of work changes.

2. Turning to text: type / talk / OCR

Scanned copies of the Jurors Lists are already very useful, but looking through them for an address, say, is slow and error-prone. It's much quicker to find information if we can search through the text in our browser. For example, we saw that Mr Atroshenko and Mr Smirnoff both worked for Marsman, and we already know that Marsman had been busy excavating the tunnels for the wartime Air Raid Shelters. How many people in the 1941 list were working for Marsman?

I use the web browser to search for "Marsman" and it instantly shows 42 matches. Looking for them by hand would take a couple of minutes, with a good chance of missing one or two.
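If you'd rather run the same search outside the browser, here is a minimal Python sketch. It assumes you've saved the typed list as plain text with one entry per line; the function name, the sample layout and the third entry are my own placeholders, not part of the real list. Note that it counts entries that mention the word, where the browser counts every occurrence, so the two numbers can differ slightly.

```python
import re


def count_matches(list_text: str, term: str) -> int:
    """Count the entries (one per line) that mention `term`, ignoring case."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    return sum(1 for line in list_text.splitlines() if pattern.search(line))


# The two entries quoted in this post, plus one made-up placeholder entry.
sample = """\
Atroshenko, John | Overseer, Marsman, Hong Kong (China), Ld. | 5 Ashley Road, Kowloon.
Smirnoff, George Vitalievitch | Architect, Marsman, H. K. China Ld. | 14 Hart Avenue, Kowloon.
Placeholder, Person | Clerk, Placeholder Company | 1 Placeholder Road.
"""

print(count_matches(sample, "Marsman"))
```

Swap the sample text for the contents of a saved list and the same function gives you the count for any employer, street or surname.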

So if text is better, how can we convert the scanned copies to text?

2.1 Type them up

This is the simplest approach - look at the scanned copy and type the text into your computer.

It's usually a slow process, but with the Jurors Lists we can use a trick to speed things up. Each year we start off with last year's list and update it. Since many people on last year's list are still there this year, it saves a lot of typing. Working like this it only takes 20-30 minutes to finish typing a page.
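To get a feel for how much typing the trick saves, here is a rough Python sketch that measures how many of this year's entries already appear, unchanged, in last year's list. The function name and the placeholder entries are assumptions for illustration only; real entries often change slightly from year to year (a new address, say), so treat the result as a rough guide.

```python
def carried_over(last_year: list[str], this_year: list[str]) -> float:
    """Return the fraction of this year's entries found unchanged in last year's list."""
    if not this_year:
        return 0.0
    previous = set(last_year)
    unchanged = sum(1 for entry in this_year if entry in previous)
    return unchanged / len(this_year)


# Placeholder entries, not taken from any real Jurors List.
last_year_entries = ["Entry A", "Entry B", "Entry C", "Entry D"]
this_year_entries = ["Entry A", "Entry B", "Entry C", "Entry E"]

share = carried_over(last_year_entries, this_year_entries)
print(f"{share:.0%} of this year's entries need no retyping")
```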

2.2 Dictate them

I've used the Dragon speech-recognition software. I find it works well on general text, eg a newspaper article, but when the text has lots of names I have to spell them out. That makes it a lot slower, slower even than typing.

2.3 OCR them

OCR, or Optical Character Recognition, sounds great. Special software reads in the scanned pages, and converts them to text for you.

As Dragon had worked well I started off trying the OmniPage OCR software from the same company. I was disappointed with its accuracy, so I tried ABBYY FineReader instead. It was a lot more accurate and I've used it since, including converting all 80+ pages of the 1941 list to text.

If you give it a scan of a modern document, it just takes a few seconds and returns almost 100% accurate text. But with old documents like the Jurors Lists you'll still have to check and correct its mistakes where it can't recognise text that isn't clear.

Still, it learns and improves over time, so by the end of the list I was only spending around 20-25 minutes per page. 

2.4 Summary

For Jurors Lists, the current method where we type a page, but use the previous year's list as the starting point, is still my preferred solution. Anyone can join in without needing to buy any software. That's a big plus as we can share the workload. Converting the 1941 list took me around 35 hours, so any help is very welcome!

But if you're converting a lot of unique scanned documents where you can't make use of similar, earlier versions, or if the scanned copies are very clear, OCR is a big time-saver.

3. Type what you'll never read

Finally, how can we get people to join in and help us type up more pages from the Jurors Lists? I'm still struggling to find a good way to explain why it's a good thing to type a page you'll never read!

We know that the typed version is much, much faster to work with than the scanned version. And if we got everyone who'd spent half an hour looking at scanned Jurors Lists to type a page, we'd have the lot done in no time and we'd all benefit.

But when it comes down to actually typing, you're typing a page you'll probably never use yourself, and that feels like a waste of time.

A conundrum!

Regards, David

PS You can see the 1941 list online at http://gwulo.com/jurors-list-1941, and if you'd like to have a go at typing a page you can find the instructions at http://gwulo.com/current-j-list

