I just noticed that Clean Solution doesn't delete the files under
obj/, hence the attached archive was quite large. This is the same
repro as DataImport2.zip, but it's 8 KB instead of 25 MB:
https://dl.dropbox.com/u/6420016/DataImport2-small.zip
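
For anyone reading along without downloading the zip, here is a minimal sketch of what a purely sequential import loop can look like, assuming the batch size of 1024 mentioned further down the thread. The InBatches helper and the dummy integer data are made up for illustration and are not taken from the repro. The RavenDB calls are left as comments, so this runs without a server.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class SequentialImportSketch
{
    // Hypothetical helper: lazily splits a sequence into consecutive
    // batches of at most batchSize items, preserving order.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");

        var batch = new List<T>(batchSize);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }

        // Emit the final, possibly smaller, batch.
        if (batch.Count > 0) yield return batch;
    }

    public static void Main()
    {
        // Dummy data standing in for the real documents.
        var items = Enumerable.Range(0, 2500);
        var index = 0;

        foreach (var batch in InBatches(items, 1024))
        {
            var st = Stopwatch.StartNew();

            // In the real import this is where a session would be opened,
            // each item stored, and SaveChanges called once per batch.
            foreach (var i in batch)
            {
                // session.Store(i);
            }

            Console.WriteLine(@"Batch imported {0} ({1} items) in {2} ms",
                index++, batch.Count, st.ElapsedMilliseconds);
        }
    }
}
```

One session and one SaveChanges per batch keeps every batch independent, which makes it easier to see whether memory grows per batch or only across batches.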
On Jul 29, 4:57 pm, Tobias Sebring <tsebr...@gmail.com> wrote:
> I made the threading optional in the original repro, controlled by a
> boolean at the top of main(), to show off the real code before I made it
> sequential:
> var runInParallel = false;
>
> Here's an updated repro with all the optional threading gone:
> https://dl.dropbox.com/u/6420016/DataImport2.zip
>
> Note that the ConcurrentDictionary is only ever accessed sequentially and
> I left it in there because it is one of the few things in the non-ravendb
> targeted code that will allocate a big chunk of memory.
>
> On Sunday, July 29, 2012 3:52:27 PM UTC+2, Oren Eini wrote:
>
> > I can't follow the code, please create a repro without all the threading
> > complexity there.
>
> >>   "f93ht2b8is1usozq3nwqbc34ti1aln9fx5if5ra7u9mz444ktxpmc8bcg9xlaav5su7wfuukmz 6",
> >>   "MediumText": "f93ht2b8is1usozq3nwqbc34ti1aln9fx5if5ra7u9mz444ktx",
> >>   "ShortText": "f93ht2b8is1usozq3nwqbc34t",
> >>   "NumberIntervals": [
> >>     {
> >>       "NumberFrom": 75,
> >>       "NumberTo": 1985
> >>     },
> >>     {
> >>       "NumberFrom": 705,
> >>       "NumberTo": 1391
> >>     },
> >>     {
> >>       "NumberFrom": 456,
> >>       "NumberTo": 1471
> >>     }
> >>   ],
> >>   "Type": "Type1",
> >>   "Categories": [
> >>     "Category3",
> >>     "Category2",
> >>     "Category4"
> >>   ]
> >> }
>
> >> On Sunday, July 29, 2012 12:22:25 PM UTC+2, Oren Eini wrote:
>
> >>> Run this sequentially, without parallelism, first.
> >>> What is the size of the documents?
> >>> Can you create a repro?
>
> >>> On Sun, Jul 29, 2012 at 1:20 PM, Tobias Sebring <tsebr...@gmail.com> wrote:
>
> >>>> Code with commented out lines:
> >>>> var bc = new BlockingCollection<IndexedBatch<TData>>();
> >>>> var importTask = Task.Run(() =>
> >>>> {
> >>>>     bc.GetConsumingEnumerable()
> >>>>         .AsParallel()
> >>>>         .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
> >>>>         .WithMergeOptions(ParallelMergeOptions.NotBuffered)
> >>>>         .ForAll(data =>
> >>>>         {
> >>>>             var st = Stopwatch.StartNew();
> >>>>             //using (var session = Store.OpenSession())
> >>>>             //{
> >>>>             foreach (var i in data.Batch)
> >>>>             {
> >>>>                 //session.Store(i);
> >>>>             }
> >>>>             //session.SaveChanges();
> >>>>             //}
>
> >>>>             Console.WriteLine(@"Batch imported {0} in {1} ms", data.Index,
> >>>>                 st.ElapsedMilliseconds);
> >>>>         });
> >>>> });
>
> >>>> Build is from NuGet a few days ago:
> >>>> <package id="RavenDB.Client" version="1.2.2044-Unstable" />
> >>>> <package id="RavenDB.Database" version="1.2.2044-Unstable" />
> >>>> <package id="RavenDB.Embedded" version="1.2.2044-Unstable" />
>
> >>>> Batch size is 1024, based on a recommendation I picked up here in the group.
> >>>> I'm running multiple import jobs concurrently, but I also tried limiting that
> >>>> with .WithDegreeOfParallelism(1) and got the same result.
>
> >>>> On Sunday, July 29, 2012 7:41:11 AM UTC+2, Oren Eini wrote:
>
> >>>>> What lines did you comment?
> >>>>> What build are you using?
> >>>>> How many items are you using per SaveChanges call?
>
> >>>>> On Sun, Jul 29, 2012 at 4:44 AM, Tobias Sebring <tsebr...@gmail.com> wrote:
>
> >>>>>> Got that fixed. Now I'm having trouble limiting the memory footprint
> >>>>>> of RavenDb. The memory consumption gradually rises to 98% of physical
> >>>>>> RAM, at which point Windows 7 starts displaying warnings to close the
> >>>>>> program down and other applications crash randomly.
>
> >>>>>> I've tried the following things to limit memory utilization in
> >>>>>> accordance with other threads in this group:
>
> >>>>>> I've turned off indexing:
> >>>>>> using (var webClient = new WebClient())
> >>>>>> {
> >>>>>>     webClient.UseDefaultCredentials = true;
> >>>>>>     var result = webClient.UploadString(
> >>>>>>         new Uri(new Uri("http://localhost:8080"), "/admin/stopindexing"), "POST", "");
> >>>>>> }
>
> >>>>>> Modified cache configuration settings (tried with different values -
> >>>>>> same result):
> >>>>>> <appSettings>
> >>>>>>   <add key="Raven/MemoryCacheLimitPercentage" value="50" />
> >>>>>>   <add key="Raven/MemoryCacheLimitCheckInterval" value="00:00:15" />
> >>>>>>   <add key="Raven/MemoryCacheExpiration" value="60" />
> >>>>>> </appSettings>
>
> >>>>>> And disabled all caching:
> >>>>>> using (Store.DatabaseCommands.DisableAllCaching())
> >>>>>> {
> >>>>>>>> catch (Raven.Client.Exceptions.NonUniqueObjectException)
> >>>>>>>> {
> >>>>>>>> session.Delete(i);
> >>>>>>>> }
> >>>>>>>> }
> >>>>>>>> session.SaveChanges();
> >>>>>>>> }
>
> >>>>>>>> On Friday, July 27, 2012 6:21:34 PM UTC+2, Oren Eini wrote:
>
> >>>>>>>> 1) Keep a side document with the mapping, so you can easily do a
> >>>>>>>> load by id, something like:
>
> >>>>>>>> "references/data/2012/07/27/js9am2ms8la91" - {
> >>>>>>>> "DocId": "data/1"}
>
> >>>>>>>> 2) It isn't exposed to the client API.
>
> >>>>>>>> 3) http://ravendb.net/docs/server/administration/upgrade
>
> >>>>>>>> On Fri, Jul 27, 2012 at 7:16 PM, Tobias Sebring <tsebr...@gmail.com> wrote:
>
> >>>>>>>> I'm using RavenDb to do bulk inserts from a large datadump similar
> >>>>>>>> to the process outlined by Ayende here: http<
http://ayende.com/blog/>