Fetching data from 50+ tables using Linq-to-Nhibernate

matsho

I have a web service that touches 50+ database tables (the database is heavily normalized) in order to create the response. The service returns all voyages modified within a date range specified by the client.

For performance reasons I want to avoid lazy-loading, fetching as much of the graph as possible before mapping to the response type.

I have broken my query down into smaller parts, using NHibernate Fetch + ToFuture to eager load the data I need:

var fetchQuery = Session.Query<Voyage>()
    .Fetch(v => v.VoyageStatus)
    .FetchMany(v => v.VoyageLocations)
    .Where(v => voyageIds.Contains(v.VoyageID))
    .ToFuture();

Session.Query<Ship>()
    .FetchMany(s => s.ShipCsos)
    .Where(s => shipIds.Contains(s.ShipID))
    .ToFuture();

Session.Query<Ship>()
    .Fetch(s => s.ShipFlagCode)
    .ThenFetch(sf => sf.Country)
    .Fetch(s => s.ShipType)
    .Fetch(s => s.ShipStatus)
    .Fetch(s => s.ShipSource)
    .Fetch(s => s.ShipHullType)
    .Fetch(s => s.ShipLengthType)
    .Fetch(s => s.ShipBreadthType)
    .Fetch(s => s.ShipSpeedType)
    .Fetch(s => s.ShipPowerType)
    .FetchMany(s => s.ShipAttributes)
    .ThenFetch(sa => sa.ShipAttributeName)
    .Where(s => shipIds.Contains(s.ShipID))
    .ToFuture();

// [Lots of similar Session.Query<X>...ToFuture() calls]

return fetchQuery.ToList();

Problem

I'm starting to hit the SQL Server parameter limit of 2100 when the date range reaches a certain span. I thought the parameter limit only applied to a single IN clause, but it apparently applies to a query as a whole; using Futures I end up with a single SQL query with one SELECT statement for each ToFuture call (each SELECT statement contains a moderately sized IN clause).

Is there a workaround for this? For instance, is there a way to send off smaller groups of Futures to stay within the parameter limit and still hydrate the entities?

I've tried doing a fetchQuery.ToList() call halfway through the Futures. This keeps the parameter limit exceptions at bay, but the entities are not hydrated properly according to NHibernate Profiler (properties are lazy-loaded).

Any pointers would be much appreciated!

Frédéric

You may in fact be better off keeping lazy loading for performance reasons with NHibernate, even in your case.

(Wanting to switch to eager loading for performance reasons may be a sign of not knowing how to optimize lazy-loading with NHibernate. NHibernate can avoid the classical n+1 performance issue of lazy loads.)

Why lazy loading can perform well with NH

(Even in your case.)

Lazy loading with NHibernate can perform extremely well. It tends to strike a good balance between runtime performance and development effort: efficient to execute, and efficient to develop and maintain.

Adjust the lazy-loading batch-size attribute on your entity and collection mappings.

(The linked reference gives a detailed explanation of how it works.)

<class name="YourEntity" batch-size="20">
    ...
    <set name="SomeChildren" batch-size="15" ...>

With this configured, when NHibernate lazy-loads an entity or collection on access, the same query also loads up to batch-size - 1 other uninitialized entities or collections of the same kind that it tracks in its session first-level cache. Adjust the batch-size values to match the cardinalities you usually load.

This is a very powerful mechanism. Most of what would otherwise have been subsequent lazy loads are then already initialized from a single call to the database, usable without additional round trips.

(Lazy-load batching is badly defeated only in some extreme corner cases where the session is misused, so that it references many entities unrelated to your current work that have pending lazy loads. In such a situation, a batch may initialize many pending lazy loads that your current work does not need.)

You may configure a default batch size for all lazy loads of collections and entities with the global configuration parameter default_batch_fetch_size (set it in the hibernate.cfg.xml file, or through Configuration.SetProperty(Environment.DefaultBatchFetchSize, ...)).

In the current state of NHibernate, this incurs a memory cost, because NHibernate prepares the batching queries when the session factory is built. (An option may be ported from Hibernate for building those queries "on demand" rather than preparing them in advance, but this is not available yet. See #1316.)
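As a minimal sketch of the programmatic variant, assuming the rest of the configuration lives in hibernate.cfg.xml: the class name SessionFactoryBuilder and the value 20 are only illustrative, not a recommendation for your data.

```csharp
using NHibernate;
using NHibernate.Cfg;

// Hypothetical helper; only Configuration, SetProperty and
// Environment.DefaultBatchFetchSize come from NHibernate itself.
public static class SessionFactoryBuilder
{
    public static ISessionFactory Build()
    {
        var cfg = new Configuration();

        // Reads hibernate.cfg.xml (connection string, dialect, mappings...).
        cfg.Configure();

        // Default batch size for every lazy-loaded entity and collection.
        // Fully qualified to avoid a clash with System.Environment.
        // "20" is an illustrative value: tune it to your usual cardinalities.
        cfg.SetProperty(NHibernate.Cfg.Environment.DefaultBatchFetchSize, "20");

        return cfg.BuildSessionFactory();
    }
}
```

A batch-size set on a specific class or collection mapping still applies to that mapping, so the global value can serve as a baseline.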

Why eager loading could be a worse choice

In contrast, eager loading can quickly lead to code bloat and to additional work for fine-tuning and maintaining the required eager loads for each case. Failing to keep them optimized will surely lead to worse performance than lazy loading with NHibernate. Even optimized eager loading may cause a lot more data than required to be loaded.

EF up to version 6 did exactly that. (I have not checked EF Core.) Its eager-loading query strategy caused eager-loaded result sets to contain duplicated data as soon as the "root" entities had many references to the same eager-loaded child instances. (All that said, as far as I currently know, I tend to consider EF more usable than NHibernate for eager loading. But it has been quite a long time since I studied eager loading with NHibernate, its lazy loading being far more efficient than EF's.)

(To be fair, lazy loading may also cause some code bloat: if the entities are to be used after closing the session (which is not recommended: consider using DTOs, view models or the like), pending lazy loads may cause failures. To avoid them, NHibernateUtil.Initialize should be called on the lazy associations you need, before closing the session. And if you want to leverage async, you will also have to call the async version of Initialize before accessing the associations.)
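A rough sketch of what that initialization could look like. The Voyage classes below are minimal stand-ins for the entities in the question; the int key type and the ISet collection type are assumptions, as is the VoyageLoader helper itself. The async variant assumes NHibernate 5 or later.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using NHibernate;

// Hypothetical minimal entities (members are virtual so NHibernate can proxy them).
public class VoyageStatus { public virtual int VoyageStatusID { get; set; } }
public class VoyageLocation { public virtual int VoyageLocationID { get; set; } }
public class Voyage
{
    public virtual int VoyageID { get; set; }
    public virtual VoyageStatus VoyageStatus { get; set; }
    public virtual ISet<VoyageLocation> VoyageLocations { get; set; }
}

public static class VoyageLoader
{
    // Forces the lazy associations to load while the session is still open,
    // so the entity can be read after the session is closed.
    public static Voyage LoadForDetachedUse(ISession session, int voyageId)
    {
        var voyage = session.Get<Voyage>(voyageId);
        if (voyage != null)
        {
            NHibernateUtil.Initialize(voyage.VoyageStatus);
            NHibernateUtil.Initialize(voyage.VoyageLocations);
        }
        return voyage;
    }

    // Async counterpart: the associations are initialized with InitializeAsync
    // before anything accesses them synchronously.
    public static async Task<Voyage> LoadForDetachedUseAsync(
        ISession session, int voyageId, CancellationToken ct = default)
    {
        var voyage = await session.GetAsync<Voyage>(voyageId, ct);
        if (voyage != null)
        {
            await NHibernateUtil.InitializeAsync(voyage.VoyageStatus, ct);
            await NHibernateUtil.InitializeAsync(voyage.VoyageLocations, ct);
        }
        return voyage;
    }
}
```

With batch-size configured on the mappings, each Initialize call can still pull in other pending lazy loads of the same kind in the same query.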

Additional optimization available with lazy-loading

NHibernate features built-in support for second-level caching. Second-level caching allows caching data and sharing it among different NHibernate sessions.

With eager loading, the second-level cache cannot be leveraged for loading your dependent entities from memory (in case you use a memory cache provider for the second-level cache). The second-level cache is best exploited with lazy loading.

It is a full-featured data cache that handles invalidation automatically. (Provided you work with transactions. If deadlocks deter you from doing so, consider enabling read committed snapshot mode on SQL Server, but this is a bit off-topic. Without explicit transactions, the cache will be disabled as soon as you start updating entities in your application.)

You only need to enable it in the global configuration (cache.provider_class, cache.use_second_level_cache), and declare in your mappings what is cacheable (on entities and/or entity collections, with the <cache usage="..." /> tag). Use cache regions for setting expiry. You may even cache queries (cache.use_query_cache, and specifying on each query whether it is cacheable). See here for an example.

Of course, if your data is not eligible for caching, this feature is not useful in your case. (That may be so if other processes update your data and you do not wish to use and configure the SysCache2 provider, which can be notified by SQL Server of any data changes.)
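For illustration only, here is roughly how those settings could be applied in code. The property keys are the ones named above; the provider type string assumes the NHibernate.Caches.SysCache2 package, ShipType is a stand-in entity assumed to carry a <cache usage="..." /> tag in its mapping, and the CachedLookups helper is hypothetical. WithOptions assumes NHibernate 5 or later.

```csharp
using System.Collections.Generic;
using System.Linq;
using NHibernate;
using NHibernate.Cfg;
using NHibernate.Linq;

// Hypothetical lookup entity, assumed to be declared cacheable in its mapping.
public class ShipType { public virtual int ShipTypeID { get; set; } }

public static class CachedLookups
{
    // Call this before building the session factory.
    public static void EnableSecondLevelCache(Configuration cfg)
    {
        // Assumed provider: SysCache2. Substitute the provider you actually use.
        cfg.SetProperty("cache.provider_class",
            "NHibernate.Caches.SysCache2.SysCacheProvider, NHibernate.Caches.SysCache2");
        cfg.SetProperty("cache.use_second_level_cache", "true");
        // Only needed if you also want to cache query results.
        cfg.SetProperty("cache.use_query_cache", "true");
    }

    // Marks this particular query as cacheable; queries are not cached by default
    // even when the query cache is enabled.
    public static IList<ShipType> LoadShipTypes(ISession session)
    {
        return session.Query<ShipType>()
            .WithOptions(o => o.SetCacheable(true))
            .ToList();
    }
}
```

Small, rarely changing reference tables such as ship types or statuses are the usual candidates for this kind of caching.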

Side note

A well-accepted solution to your problem involves considerably more work. Ideally, your front-end application should work with a denormalized copy of your data that is easy and efficient to query, while your back office keeps a normalized database.
