Batch Size Setting

Mar 16, 2011 at 5:04 PM

First - thanks to everyone contributing to this component.

Is there a place to adjust the batch size setting? I seem to remember this being mentioned for the MapPoint geocoder, and I was wondering if there is an equivalent in this version.

Coordinator
Mar 16, 2011 at 5:29 PM

The batch size settings can be found in the data flow task properties, in the "Misc." category. That is, they are in the data flow container in which the batch geocoder component resides, not in the geocoder itself. There are two settings:

  • DefaultBufferMaxRows (initial value is 10,000)
  • DefaultBufferSize (initial value is 10,485,760)

Since these properties reside outside the custom geocoder pipeline component, it is up to whoever (or whatever) creates the data flow task to set them. The Bing Maps restrictions are:

  • No more than 300 MB per batch, uncompressed (i.e. compressing it doesn't help)
  • No more than 200,000 locations

I have not attempted to measure the relationship between the incoming buffer size and the outgoing geocode XML request size. To exceed the 300 MB Bing Maps limit, the data would have to expand by a factor of 30 (30 x 10 MB = 300 MB). I suppose that is possible, although it doesn't seem likely. I don't know offhand what the case would be for the MapPoint geocoder; I didn't work with that one.
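The arithmetic above can be checked with a short sketch. It assumes the default DefaultBufferSize of 10,485,760 bytes and counts the 300 MB limit in binary megabytes; the constant and function names are illustrative only, and the real ratio between buffer size and XML request size remains unmeasured.

```python
# Defaults and limits quoted in this thread; the names are illustrative.
DEFAULT_BUFFER_SIZE = 10_485_760          # bytes, SSIS data flow default
BING_MAX_BATCH_BYTES = 300 * 1024 * 1024  # assumption: 300 MB in binary megabytes


def expansion_needed(buffer_bytes: int, limit_bytes: int = BING_MAX_BATCH_BYTES) -> float:
    """Return how many times a full buffer must grow to reach the limit."""
    if buffer_bytes <= 0:
        raise ValueError("buffer_bytes must be positive")
    return limit_bytes / buffer_bytes


if __name__ == "__main__":
    # Expansion factor for a completely full default buffer.
    print(expansion_needed(DEFAULT_BUFFER_SIZE))
```

If DefaultBufferSize is raised in the data flow task, the same function shows how much less headroom is left before the request could hit the limit.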

If you happen to run into the 200,000 location limit, you can probably allow for that in your data flow source. The queries I use typically geocode only locations that have not already been geocoded, so limiting the number of locations per batch should simply be a matter of including a TOP(200000) clause (or a smaller number). Again, the numbers might be different for the MapPoint geocoder.
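As a rough sketch of that idea, the helper below builds a source query capped at the location limit. The table and column names (dbo.Locations, LocationId, Address, Latitude) are hypothetical, as is the convention that a NULL latitude means "not yet geocoded"; adjust them to your own schema.

```python
BING_MAX_LOCATIONS = 200_000  # per-batch location limit quoted in this thread


def build_pending_query(batch_limit: int = BING_MAX_LOCATIONS) -> str:
    """Build a T-SQL query selecting at most batch_limit ungeocoded rows.

    dbo.Locations, LocationId, Address, and Latitude are hypothetical names.
    """
    if not isinstance(batch_limit, int) or not 1 <= batch_limit <= BING_MAX_LOCATIONS:
        raise ValueError(f"batch_limit must be an integer from 1 to {BING_MAX_LOCATIONS}")
    return (
        f"SELECT TOP ({batch_limit}) LocationId, Address "
        "FROM dbo.Locations "
        "WHERE Latitude IS NULL "  # assumption: NULL marks 'not yet geocoded'
        "ORDER BY LocationId;"
    )


if __name__ == "__main__":
    print(build_pending_query(50_000))
```

Because only ungeocoded rows are selected, running the package repeatedly should work through a larger backlog one batch at a time.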

Mar 16, 2011 at 5:40 PM

Thank you, I would not have looked there!

Jun 29, 2011 at 3:19 PM

How long does it usually take to geocode the following numbers of locations? What should the turnaround time be?

 10,000  --

 50,000  --

100,000 -- 

200,000 --

thanks

John

 

Coordinator
Jun 29, 2011 at 5:29 PM

I haven't benchmarked Bing Maps to see how long it takes for large batches. It can be pretty slow even on small batches, although I am using a free (non-profit organization) account, and I don't know if that makes a difference. The SSIS component submits the request and then periodically polls the Bing website until it receives a response indicating that the job is finished. Potentially, that could amount to a very long time. For small batches, the polling interval itself may represent a significant portion of the total delay.
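The submit-then-poll pattern described above can be sketched generically. This is not the component's actual code: the status check is passed in as a callable, and the 30-second interval and one-hour timeout are placeholder assumptions, not values taken from the component or from Bing.

```python
import time
from typing import Callable


def poll_until_done(
    is_finished: Callable[[], bool],
    interval_s: float = 30.0,    # assumed polling interval
    timeout_s: float = 3600.0,   # assumed overall timeout
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Call is_finished() every interval_s seconds; return elapsed seconds.

    Raises TimeoutError if the next wait would run past timeout_s.
    """
    start = clock()
    while True:
        if is_finished():
            return clock() - start
        if clock() - start + interval_s > timeout_s:
            raise TimeoutError("job did not finish within the timeout")
        sleep(interval_s)
```

With these placeholder values, a job that completes 5 seconds after submission is not seen as finished until the check at 30 seconds, which is why the interval can dominate the delay for small batches.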

The SSIS Batch Geocoder is a tool for connecting with the Bing Maps Batch Geocoder flow. The details of the latter should be documented by Bing, but I have not turned up a lot of information so far. I have provided links to the parts of the documentation that I know about.

Jul 19, 2013 at 6:00 PM
60,000 (Russian locations): 31 min
110,000 (US): 80 min
172,000 (US): 114 min

Keep in mind that times vary. I have seen a lot of variance, and the overall time probably depends on Bing's request load at the time you submit the job.
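For a rough feel of what these figures imply, the sketch below turns the three reported runs into locations per minute and uses the fastest and slowest rates to bracket an estimate for another batch size. It assumes the reported times are in minutes and that throughput scales roughly linearly, which three data points cannot confirm; the function names are illustrative.

```python
from typing import List, Tuple

# (locations, minutes) pairs reported in this thread
REPORTED_RUNS: List[Tuple[int, float]] = [(60_000, 31), (110_000, 80), (172_000, 114)]


def locations_per_minute(locations: int, minutes: float) -> float:
    """Average throughput of one run."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return locations / minutes


def estimate_minutes(locations: int, runs: List[Tuple[int, float]] = REPORTED_RUNS) -> Tuple[float, float]:
    """Return (best_case, worst_case) minutes using the fastest and slowest observed rates."""
    rates = [locations_per_minute(n, m) for n, m in runs]
    return locations / max(rates), locations / min(rates)


if __name__ == "__main__":
    for n, m in REPORTED_RUNS:
        print(f"{n:>7} locations: {locations_per_minute(n, m):.0f} per minute")
    best, worst = estimate_minutes(100_000)
    print(f"100,000 locations: roughly {best:.0f} to {worst:.0f} minutes")
```

Treat the resulting range as a loose guide only, given the variance noted above.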