OK – as some of you will know, over the last few days many people have joined together and provided the hands and effort to solve CAPTCHAs on the CEAC system. Xarthisius coded a BRILLIANT tool that was able to harness that manpower and capture the CEAC data – case by case. That data provides an invaluable insight into what happened in DV2017. Knowing what happened allows us to make better predictions about the next year of DV, and also gives a better general understanding of the process.
So – I want to explain some of the findings and over the coming days I can elaborate on what it means for DV2018.
Firstly, even though DV2017 is over, the CEAC data is still being updated. Some of the embassies are slow to update CEAC, and some cases will not show the accurate final position. So – the CEAC data isn’t “precise”, and requires a little interpretation. As an example – take 2017AF62. A few days ago that case was showing as AP for the principal and ready for the derivatives. Now the case shows Issued and the 3 derivatives show ready. The issued update happened on September 27th – and I would not be surprised to see the derivatives get updated in the next few days. Because of these continuing updates, we ideally want to continue the scraping. This will improve the accuracy of our data overall – although the numbers we already have can be considered highly representative – I expect only a few hundred changes at most.
Next, let’s discuss what is interesting about the data. Well the first thing is issued counts of course – and I will give those here. But we also can learn from the AP and refused cases. We can analyze that data to find which embassies are “strict” and so on. I last did that with 2014 data and published the league table. I will update that in the next couple of days.
AP cases tell us a lot about which embassies tend to lean toward AP. That isn’t always for one embassy though – for instance we know that Iranian cases have a very high chance of AP – but those cases are processed at three embassies.
So – for the issued cases (and remember, these numbers could change over the next few days as CEAC gets updated), here are the counts per region. These numbers DO NOT include adjustment of status – which normally adds another 1500 cases or so globally.
AF – 18981
AS – 7441
EU – 19919
OC – 696
SA – 1700
Total – 48737 (Remember AOS will be up to 1500 on top of this number)
So – we can see that, once AOS is added, they have indeed hit the 50,000. Remember, a few more cases will be added to the issued count for the AF and AS regions in particular since these were the earliest regions to be scraped.
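As a quick check of the arithmetic above, here is a minimal sketch that sums the regional issued counts and adds the AOS estimate. The counts are the ones listed above; the 1500 AOS figure is only the rough upper estimate mentioned in this post, not a measured number.

```python
# Issued counts per region as listed above (CEAC snapshot, excludes AOS).
issued = {"AF": 18981, "AS": 7441, "EU": 19919, "OC": 696, "SA": 1700}

# Assumption: adjustment of status adds up to roughly 1500 cases globally.
AOS_ESTIMATE = 1500

ceac_total = sum(issued.values())
total_with_aos = ceac_total + AOS_ESTIMATE

print(f"CEAC issued total: {ceac_total}")
print(f"With AOS estimate: up to {total_with_aos}")
```

The regional counts will drift slightly as CEAC continues to be updated, so the totals should be read as a snapshot.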
Now – some explanations of factors that affect the DV2018 process. In every year, there are a few factors that need to be understood.
I have explained before about “Holes” (shown by density charts). Simply, holes are cases disqualified before the results are announced because the case broke some rule OR because a country was “limited” during the draw. I have explained this in detail here and also here and I have produced density charts previously with full explanation here. So – now we have the density charts for DV2017.
In the chart below we can see that the density for the numbers up to CN15000 is an average of 76%. That means 24% of all Africa cases are holes – disqualified for double entries etc before the May announcement. Hole cases get picked along with genuine cases in the initial selection but are pulled out of the final selection and never get notified of the selection.
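To make the idea of density concrete, here is a rough sketch of how it could be computed from a list of scraped case numbers: count how many case numbers actually exist in each block of 1000 and divide by the block size. The function name, the block size and the sample data are all made up for illustration; this is not the exact method behind the charts.

```python
from collections import Counter

def density_by_bucket(case_numbers, bucket_size=1000):
    """Return {first CN of bucket: fraction of CNs in that bucket that exist}.

    Hypothetical helper for illustration. Buckets cover 1..1000, 1001..2000, etc.
    Buckets with no cases at all do not appear in the result.
    """
    counts = Counter(
        ((cn - 1) // bucket_size) * bucket_size + 1 for cn in set(case_numbers)
    )
    return {start: n / bucket_size for start, n in sorted(counts.items())}

# Made-up sample data: 750 real cases in CN 1-1000, 600 in CN 1001-2000.
sample = [cn for cn in range(1, 1001) if cn % 4 != 0]
sample += [cn for cn in range(1001, 2001) if cn % 5 < 3]

densities = density_by_bucket(sample)
for start, density in densities.items():
    # The holes rate is simply whatever is left over after the real cases.
    print(f"CN {start}-{start + 999}: density {density:.1%}, holes {1 - density:.1%}")
```

A drop in density from one bucket to the next, as in the made-up sample, is the kind of pattern the charts show when a country gets limited.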
Between CN15000 and CN16000 there is an obvious drop in density. The rate drops to 58.7%. To explain why that happens you should make sure you read the holes theory explanations linked above, but for illustration I am also including a second chart for Accra Embassy alone. That chart clearly illustrates the absence of cases after 16000. We have to remember that a foreigner (non Ghanaian) interviewing in Accra would show up on the second chart as this data gives selectees by embassy, not country of chargeability. So – that explains the 4 individual cases above 16000.
From the second chart we can also see the rate at which Ghana is getting selectees, although that does not include “non responses” (explained later in this article). Even so, it clearly shows the compression of Ghana cases in the CNs up to 16000. With a little more complicated logic I am able to calculate the rough numbers of entries that a country had, and the non response rate for the country.
In a normal year, other countries get limited too. Ethiopia, Egypt, Congo DRC all have the potential to be limited. Using the charts I can see Egypt was limited at around 32000 in DV2017, but other countries have a less obvious cutoff. In DV2015 and DV2014 these cutoffs were more obvious because the selectee counts were higher – and DV2018 will be more like those years.
So – comparing Asia to Africa region we can see the initial holes rate is lower (6.5% for Asia, 24% for AF). That, simply put, means AS region entries more typically follow the rules. That density rate continues until Iran and Nepal hit their selectee limits at around 8100 or 8200. I did hear of some second draw selectees from Nepal at around 8600, but I assume those were statistical outliers.
Interestingly, Nepal actually seems to cut off in my charts by 7600. That number is about where the 1st draw was limited for Nepal, and I know from non response analysis that there is a noticeable spike in non responses in AS region after 7600. My assumption is that the agents in Nepal (they are “good” agents, not the bad type) probably told people to not submit their DS260 for the second draw selectees, OR many selectees in Nepal simply didn’t know about their win. I’d like to hear from high case number selectees from Nepal to confirm those assumptions.
For EU region, the initial holes rate is 20.5%, again, much higher than Asia region.
In EU region the only country limits were on Ukraine and Uzbekistan (affectionately known as the U2 countries). Both countries hit their limits pretty early, one just after 15000 and the other around 17000.
OC and SA region do not have limited countries, so their charts are less “interesting” in a sense.
OC holes rate is 8%
SA region holes rate is about 6%
There are cases that stay on the status “at NVC”. These cases are non responses – people who did not submit their DS260, or submitted it so late that the DS260 could not be processed in time. I have published an article about that before – here. There are also some cases where the selectee is doing adjustment of status, but AOS cases are only about 3 – 4%.
Non responses are interesting because they help explain why KCC select so many additional selectees. So – if a region has 30000 selectees as published in May, but 25% are not going to respond, then there are “only” 22500 selectees left. So – non response rate is highly significant to predict chances for a region of going current.
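The example above is simple enough to write as a tiny calculation. This is only a sketch; the function name is made up, and the 30000 and 25% figures are the illustrative numbers from the paragraph above, not real data for any region.

```python
def expected_responders(selectees, non_response_rate):
    """Selectees left once the non responders are taken out (hypothetical helper)."""
    return round(selectees * (1 - non_response_rate))

# Illustrative numbers from the example above: 30000 selectees, 25% non response.
remaining = expected_responders(30000, 0.25)
print(f"Selectees likely to respond: {remaining}")
```

The same calculation could be repeated per country rather than per region, since the non response rate can vary a lot between countries.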
We have to be a little wary of using the starting non response rate across the whole region, because the NR rate can vary by country. It makes sense that the U2 countries for example would have a high NR rate because the agencies there are registering people speculatively – later approaching the “winners” with the offer of the Green Card win for a few thousand dollars of bribe money.
The AF chart shows significant non responses starting at around 28%, and rising with high case numbers. Again – this is selectees with high case numbers holding back, assuming they would not get an interview.
The AS region demonstrates the spike I mentioned for late cases not submitting. The initial non response rate is 27%. I am surprised it is so high.
The EU non response chart shows the late spike in non responses and 34% initial rate.
OC region non response rate is startling at about 50%, although OC also has a high AOS rate (because of the easy access for Australians to non immigrant working visas).
SA region also has a high non response rate (49%), but again, has a fairly mobile population, so might have higher AOS.
OK – I have to stop at this point to spend the day with my wife as it is her birthday! But hopefully you will see the incredible wealth of information that this data provides. I am so incredibly grateful to Xarthisius for understanding the vision for the crowd-driven scraper and being able to make it work, and to all the people out there who made it happen. Let’s carry on scraping for a few days to update the database with the final DV2017 numbers, and then we can start again in 2018. I don’t expect to have any DV2018 data in the CEAC system until January 1 2018. Once it is there we can harness the incredible power of the wonderful readers of this blog!