A lot of my understanding of the DV lottery process was built or enhanced by having the ability to analyze the data from the CEAC system. The Department of State makes this data publicly available on a website where anyone can check the status of any case number. Until April of 2016 I had a script that could check those numbers one by one and store the results to a file. I published that file a couple of times a month so people could see various data such as the number of visas issued, cases refused, people on AP and so on. It also allowed me to predict Visa Bulletin numbers with surprisingly good accuracy.
In April of 2016, the website that provided that information was modified slightly to include a “CAPTCHA” challenge where you have to enter some characters for each case as a way of proving you are a human – not a robot/script. I talked about this more in this post. I also suggested a possible solution using many people to solve the captchas and asking for developers to help.
Not having the CEAC data is tremendously restricting. It’s like we are blinded or in a dark room. We can’t see progress, we can’t predict what will happen or understand what has happened. I would like to extract the 2017 data to understand what has happened – and then when the 2018 CEAC data is available (January 1 most likely) I would like to be able to get that data to better answer your questions. So – if you are sick of seeing me write “wait and see” – we need the CEAC data.
I was contacted by a couple of developers over the last year or so. People were keen to help – and some progress was made – one developer (thanks Matias!) put some effort into programmatically solving the captchas – but that is extremely difficult to do.
Now a smart guy (Xarthisius) has developed a working solution. We need a few people to help test it over the next week or two in a limited way so we can make sure the system works and can handle the workload. In a test, Xarthisius was able to extract around 1000 cases and I have validated the extracted results as accurate.
The task would be to log in to a page where you enter your name and (optionally) your email address. We would like to know who is performing the work so we can “thank” people for their effort.
Then you see a page where the Captcha is shown and you enter the details to solve the captcha. Behind the scenes, each time you solve a captcha we can extract a case. It takes a few seconds per captcha – so if people can give us a few minutes spent on a fairly “boring” activity, we can perhaps turn the lights back on.
Xarthisius and I would like to run a test with about 20 to 40 people. If you are interested in helping by giving a few minutes of your time over the next week or so, email me at britsimon3 at gmail. I will then give you a web address and you will be able to perform the test.
September 29, 2017 at 09:54
This is my email [email protected] I want to be part of the test
September 29, 2017 at 03:17
This might be illegal.
https://en.wikipedia.org/wiki/Goatse_Security#AT.26T.2FiPad_email_address_leak
https://en.wikipedia.org/wiki/Computer_Fraud_and_Abuse_Act
You should probably talk to an attorney before continuing.
September 29, 2017 at 03:27
No it is not illegal!!
The data is in the public domain, and the Department of State provides the following copyright information.
“Copyright
Links to Department sites are welcomed. Unless a copyright is indicated, information on the Department of State Web Site is in the public domain and may be copied and distributed without permission. Citation of the U.S. State Department as source of the information is appreciated.”
September 28, 2017 at 21:11
Email sent Brit. I am willing to test the new captcha.
September 28, 2017 at 18:51
I have sent you an email; I want to be part of the tests.