First of all I want to say Happy New Year to all of you. I wish you health and happiness and I hope this year is the year your dreams come true.

As I announced in this earlier post, the CEAC data became available at midnight on January 1st. It is really important that as many people as possible take a few minutes a day to solve some CAPTCHAS to help scrape the CEAC data. Over 4000 people have looked at my announcement, and nearly 300 people have tried out the system and solved at least 1 CAPTCHA. BUT we need more people to share the workload! Right now members Simba, Abdikarim, Guyanese_Spice and ¥εηƴ have each solved over 2000 CAPTCHAS – BIG thank you to them! I know solving CAPTCHAS is harder on some phones, so some won’t be able to help much, but I do think we can make the scraping go faster and easier if we can share the workload among more readers!

A BIG thank you to those members that have already solved some CAPTCHAS. One member was smart enough to mention how many CAPTCHAS he had solved when he asked a question this morning, and I was especially careful to give him a full and detailed answer to his question. Those who put effort in for the community as a whole can rightfully expect my full attention!

Now then – some people might not know why this data is important. So – let me explain what it is, how much data we have so far, and what we can figure out with the data.

CEAC is a database that contains the status of every case for the DV lottery. The earlier article explained how you can check the status of your own case. It only gives a high level status. It cannot tell you whether your DS260 is processed, it cannot tell you when your AP will finish, and it cannot tell you your actual interview date. It shows the status of the case (waiting to be scheduled, ready for interview, issued, refused, AP etc). Once the case is scheduled, it tells us the embassy the case is assigned to and the number of derivatives on each case. We can also figure out the holes, which is what we refer to as density (how many cases are missing from each 100 case numbers). Density decreases in the AF, EU and AS regions, which is why VB progress can accelerate later in the year. Being able to predict that density drop helps in predicting VB progress. So – it is very useful data.
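To make the idea of holes concrete, here is a minimal Python sketch that counts how many case numbers are missing in each block of 100. It is only an illustration: the function name `holes_per_bucket` and the sample numbers are made up, and it is not the code the scraper itself uses.

```python
def holes_per_bucket(case_numbers, bucket_size=100, max_case=None):
    """Return {bucket_start: count of missing case numbers in that bucket}.

    case_numbers: the case numbers that actually exist (hypothetical input).
    max_case: highest case number to consider; defaults to the largest seen.
    """
    present = set(case_numbers)
    if max_case is None:
        max_case = max(present)
    holes = {}
    for start in range(1, max_case + 1, bucket_size):
        end = min(start + bucket_size - 1, max_case)
        found = sum(1 for n in range(start, end + 1) if n in present)
        holes[start] = (end - start + 1) - found
    return holes


# Made-up sample: every 2nd number exists in 1-100, every 4th in 101-200.
sample = list(range(1, 101, 2)) + list(range(101, 201, 4))
print(holes_per_bucket(sample, max_case=200))  # {1: 50, 101: 75}
```

In this made-up sample the second block of 100 has more holes than the first, which is the kind of thinning out that lets the VB move faster later in the year.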

So far we have solved about 35000 CAPTCHAS (each CAPTCHA allows us to check one case number). Xarthisius set up the software to go after the current case ranges first and then go back and find the rest of the case numbers, right up to the highest case number in each region. Right now the scraper has already got all the cases that have been scheduled up to the latest VB – so 14300 for AF region, 10700 for EU and so on. That has allowed me to calculate the totals for each region, which I will show below. Currently the system is going after the higher case numbers in each region. We probably have around 65000 more numbers to check. Once we have checked ALL the case numbers we will be able to see the density changes throughout the whole number range and perform some extensive analysis. However, some interesting details are already being revealed. More on that later in this article.

So – Here are the current totals as shown by the data.

Please note that because of the way CEAC is updated by the embassy, there are some cases where AP, REFUSED and READY numbers are not accurate. For instance, if a principal selectee is refused, the derivatives on that case CANNOT be approved. However, some derivatives might remain in READY or AP status because the embassy doesn’t mark all derivatives as refused. I have a way to calculate those cases but for now, these numbers are a good enough representation to show the progress. However, the embassies seem to be more diligent with Issued status – so that number is more reliable. We can see that nearly 9000 visas have been issued so far – that being after 3 months of interviews. Bear in mind that many of the AP cases will be successful, so we will see that issued number rising faster in later months. That is normal.


Now – I also mentioned there was an interesting finding in the data so far. This concerns the EU region. As of now I can only see EU numbers up to about 12000. But there is a very obvious drop in density that happened just after 10700. That means at least one of the countries (probably Uzbekistan or Ukraine) met the selection limit by that point. In other words, for that country (and I don’t know which one it is) ALL their cases were given case numbers below 10700. Each of those two countries had nearly 4500 selectees (including derivatives), so that means one of those countries took about 2800 to 3000 cases in the first 10700 case numbers. To understand why that happens please read my old post that explains the draw process.
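For anyone curious how a drop like that could be spotted in the numbers rather than by eye, here is a rough Python sketch. It compares the average number of cases found in a few blocks of 100 before a point with the average just after it. The function name, the window of 3 blocks, the 0.7 threshold and the sample counts are all assumptions for illustration, not the method or data used for this analysis.

```python
def find_density_drop(found_per_bucket, window=3, ratio=0.7):
    """Return the index of the first bucket where density clearly drops.

    found_per_bucket: cases found in each consecutive block of 100 numbers.
    A drop is flagged when the mean of the next `window` buckets falls
    below `ratio` times the mean of the previous `window` buckets.
    Returns None if no such point exists.
    """
    for i in range(window, len(found_per_bucket) - window + 1):
        before = sum(found_per_bucket[i - window:i]) / window
        after = sum(found_per_bucket[i:i + window]) / window
        if before > 0 and after < ratio * before:
            return i
    return None


# Made-up counts: about 60 cases per block, then about 40 per block.
counts = [60, 62, 58, 61, 59, 40, 38, 41, 39]
print(find_density_drop(counts))  # 5
```

With real data the block at that index would be multiplied by 100 to get the approximate case number where the density changes.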

Now, what does this mean for VB progress? Well, in a post I wrote a couple of months ago (actually written on October 12) I explained how VB progress happens in some detail, using EU as an example. At that time I mentioned we would see increases of about 2500 per month for EU until March or April. In fact, the reduced density means we will start to see slightly faster VB progress, probably about 3500 for the next VB for EU. I had predicted we would see about 16000 by April interviews, and in fact we should be nicely above that by April. These are rough estimates, not accurate predictions, but once I start seeing more complete data I will be able to refine the guesses. I cannot do the same for other regions yet because I need the rest of the data. So please – don’t bother asking me – just get solving CAPTCHAS!! 🙂
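The arithmetic behind that EU estimate can be sketched in a few lines of Python. This assumes the current cutoff of 10700 is two VBs away from the April interviews, and that only the next VB moves at the faster 3500 pace while the one after stays at 2500. Both are assumptions made for this illustration, and the function name `project_cutoff` is hypothetical.

```python
def project_cutoff(current_cutoff, monthly_increases):
    """Return the projected VB cutoff after each monthly increase."""
    cutoffs = []
    cutoff = current_cutoff
    for increase in monthly_increases:
        cutoff += increase
        cutoffs.append(cutoff)
    return cutoffs


# Assumption: two more VBs until the April interviews for EU.
old_pace = project_cutoff(10700, [2500, 2500])
new_pace = project_cutoff(10700, [3500, 2500])
print(old_pace)  # [13200, 15700]
print(new_pace)  # [14200, 16700]
```

Under those assumptions the old pace lands close to 16000, while one faster month is already enough to put the cutoff above it.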