Project Review Time (Application No-Nos)

In my previous post I talked about the database and some of the hardships we encountered during this project. Now I would like to go through the application layer and describe some of the critical areas which hurt us during and after development.




Web UI Architecture


I think there is a beauty in simplicity which is often mistaken for a lack of experience or expertise. I think this was the key driver for much of the core design used in this application. Our initial requirements document stated that the site must be built using web parts, and it described in great detail how each of the web parts would interact with the others. However, no thought was given to how this would be used by the end users. How was this going to solve the business problem? How was it going to be utilised to benefit the end user? Only once most of the application was built was any thought given to the end users and how this architecture would work for them. It became clear the solution was technology driven rather than business problem driven, and it did nothing to satisfy the end users' needs.

Delving deeper into the detail of the design, we used Microsoft web parts with ASP.NET and Anthem update panels. This seems like it would work, but a few things were overlooked which killed performance and tied our hands during development.


Entity Framework 3.5 was used, and DTOs were not, which meant the entire object graph was sent back to the UI instead of a targeted set of information. Because the database was so complex, the entities returned were massive. To top that off, viewstate was used to store the entity graph between posts. This meant that each time a web part needed to refresh, the entire object graph had to be serialised and deserialised. The viewstate was stored in the database, so each request would populate a massive set of entities and then serialise them back into the database.

The web parts were hooked up to events, which were triggered after something happened, to decouple the web parts from each other. However, I did not really see the gain in having a search control decoupled from its results list, as there was no real need to reuse these web parts anywhere else, and so we had complexity and no return. What ended up happening was that 1 page was created with up to 10 web parts. So instead of having 1 page for each set of related functionality, the 1 page contained maybe 3 pages' worth of functionality.

Let me explain. Usually, a page would be set up to load a search criteria area, and when the user clicks search a list of results is displayed. Then if the user clicks select on 1 of the rows, a new page is presented with the information selected. This page may have some more related information which the user can click through to. Once at the desired location, users can edit some of the information, only this information is modified, and the request can be redirected to the desired location. This technique not only splits out the logic of each of the pages, but also helps by minimising the amount of work the application must do between posts.

In the case of web parts, a user clicks search and a result is displayed... this is where it gets "funny". When the user clicks select, the search needs to run again to retrieve the results list (we could not store the results list because it was too big for the viewstate and took too long to serialise and deserialise). The selected entity is then put into viewstate, and an event is raised passing the entity to be used in another web part (on the same page).

The second reason this did not work was that when we did a search for some information, we not only needed to retrieve the list of results, but also all the data we were going to need for all the other possible web parts that could be used on the page. So in trying to decouple the web parts, we were actually tightly coupling them indirectly through the underlying data.

In the first case, we would navigate around 2 or 3 pages. In this case, all the information is displayed on 1 screen. When you click save, 1 piece of information is saved, but then each and every web part needs to refresh what it is displaying. We did not spend enough time thinking about the overall design, and it definitely did not work well when it was all put together.



The Wonderful, Wonderful Land of jQuery
jQuery is a wonderful tool, but in the wrong hands it can cause a world of hurt. When ajax was becoming popular I started reviewing various JavaScript frameworks (back before jQuery). I opened a few tabs to these modern (web 2.0, we called it back then) web pages, and, well, my PC didn't really cope too well. All that client-side processing killed my machine.

In the same way, we abused jQuery and added another 3 second lag to every request, and that is before we even get anywhere near the back end processing. The main problem here was that the ajax callbacks did not play nicely with the document.ready used in jQuery, and so in response, 1 method was created where everything for every page was loaded, and this was called each time an ajax postback was handled. I recently worked on another project which used jQuery and ajax postbacks and found an AjaxReady extension which works great. (AjaxReady)

ORM Attached Vs Detached Model
One of the very first problems you face when using something like Entity Framework in web applications is how to get what the user entered into an attached entity object, so it can be saved to the database. This seems like a trivial problem, but in the fairly new world of M$ ORM it's a problem which recurs time and time again. Best practice is to use a data context within a using statement, which means the objects you get back are detached. This means the objects can be modified without fear of unexpectedly updating the database.



The problem we had was that we started out this way, and then changed tack halfway through to an attached model. Instead of making requests which were detached from the database, we made all the requests attached all the time, and when a user submitted a form, the first thing we did was attach it to the context and then process it. It was deemed quicker to do it this way as the back end processing needed a lot of additional data to complete. THIS WAS A NIGHTMARE TO WORK WITH. (Capitalisation here is for effect; if HTML still supported blink, I would make it blink too.)


Not only was it problematic due to all the code working one way and then needing to work the total opposite way, but also because every update was potentially being pushed to the database whether you wanted it to be or not. The application in question did a lot of bulk processing, so when we processed and saved 10,000 records and there were 1500 phantom records added, it was incredibly difficult to find where these 1500 records had been added.
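To make the phantom record problem a little more concrete, here is a toy sketch in Python. It is not Entity Framework and not the code from this project; ToyContext, Record and estimate_cost are hypothetical names invented for illustration. The only assumption it models is the one described above: once something is attached to a long-lived context, it gets written on the next save, whether or not the caller meant it to be.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    id: int
    cost: float


@dataclass
class ToyContext:
    """Hypothetical stand-in for an ORM context; not the Entity Framework API."""
    tracked: list = field(default_factory=list)
    saved: list = field(default_factory=list)

    def attach(self, record):
        # Once attached, the record is written on the next save_changes().
        self.tracked.append(record)
        return record

    def save_changes(self):
        # Everything tracked is persisted, intended or not.
        self.saved.extend(self.tracked)
        written = len(self.tracked)
        self.tracked.clear()
        return written


def estimate_cost(ctx, source):
    # A helper that only wants a scratch copy for a calculation,
    # but attaches it to the shared context as a side effect.
    scratch = ctx.attach(Record(id=-source.id, cost=source.cost * 1.1))
    return scratch.cost


ctx = ToyContext()
batch = [ctx.attach(Record(id=i, cost=10.0)) for i in range(1, 4)]
estimate_cost(ctx, batch[0])    # looks read-only from the call site
written = ctx.save_changes()    # 3 intended records plus 1 phantom
```

With a short-lived, detached style, the scratch record in estimate_cost would never be tracked, so the save would only write the 3 records the caller asked for.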


DTO vs Entities
When starting out a new project, there has to be a decision on basic architecture up front: to DTO or not to DTO. On the one hand, they limit data exchange, and on the other they have a performance hit and can tightly couple the UI to the service layer. After this project I am on the fence about DTOs; I think there is a time and a place to use both. This project would definitely have benefited from DTOs. The database complexity meant the object graph was so big that each time we sent down an entity to be serialised in viewstate, we had a massive performance hit. It also meant that any processing using the many related objects was fraught with error due to the complexity of the usage. So instead of using a nice flat structure like dto.property = blah, we might need to do something like entity.firstchild.secondchild.where(p=>p.property=something).first.thirdchild.intersect(someotherentity.firstproperty.where(p=>p.property.secondproperty=something)...... and it goes on. All this complexity could have been handled in the translation layer, and a nice easy-to-use DTO returned. Instead, this code was everywhere, and sometimes different in different places. And so, many nice unique issues arose.
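As a rough sketch of what "handled in the translation layer" could look like, here is a small Python example. The entity and DTO shapes (OrderEntity, Customer, Address, OrderSummaryDto) and the to_summary_dto function are all hypothetical; they are not taken from this project's data model. The point is only that the graph traversal is written once, and callers get a flat object back.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Address:
    kind: str
    city: str


@dataclass
class Customer:
    name: str
    addresses: List[Address]


@dataclass
class OrderEntity:
    id: int
    customer: Customer
    line_totals: List[float]


@dataclass(frozen=True)
class OrderSummaryDto:
    order_id: int
    customer_name: str
    billing_city: str
    total: float


def to_summary_dto(order: OrderEntity) -> OrderSummaryDto:
    # The graph traversal lives here, once, instead of in every caller.
    billing = next(
        (a for a in order.customer.addresses if a.kind == "billing"), None
    )
    return OrderSummaryDto(
        order_id=order.id,
        customer_name=order.customer.name,
        billing_city=billing.city if billing else "",
        total=sum(order.line_totals),
    )


order = OrderEntity(
    id=7,
    customer=Customer(
        "Acme", [Address("shipping", "Leeds"), Address("billing", "York")]
    ),
    line_totals=[10.0, 5.5],
)
dto = to_summary_dto(order)
```

The UI code then reads dto.billing_city and dto.total directly, and if the rule for picking the billing address changes, there is one place to change it.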

Storing Transient values
In every web application we need to store something between posts, be it DTOs, entities, name-value pairs, whatever. This is the nature of the web, and there is nothing we can do except hack together our best attempt at working around it. Session state is often misused to store these things, but this often causes memory issues and data loss down the track.
Viewstate is another alternative. Viewstate stores the data serialised in the web page. This is good, but it can sometimes mean your pages become very big if an object graph becomes too large. The solution to this problem is to store the viewstate in the database.

Let me paint the picture. We are using entities with a complicated data model, which come from the database to a group of web parts that need to extract an exorbitant amount of data to cover all the functionality offered, instead of just a small targeted subset. This object graph then comes down to the UI, where it gets serialised into viewstate and sent back to the database on every request. This could be fine (although probably not, just due to the size and complexity of the data) if each user had their own database. But they don't, and this means it does not matter how many servers we chuck at this problem, it always comes back to 1 thing..... the one, single database.
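To give a feel for the difference in payload, here is a small Python sketch that serialises a made-up object graph and a small targeted state, then compares the sizes. JSON is used purely as a stand-in for viewstate serialisation, and the shape of entity_graph is an assumption for illustration, not this project's data.

```python
import json

# Hypothetical entity graph: one parent with many related child rows.
entity_graph = {
    "id": 42,
    "name": "Contract 42",
    "children": [
        {"id": i, "description": "line item %d" % i, "history": list(range(20))}
        for i in range(200)
    ],
}

# Targeted state: only what the next request needs to reload the data.
targeted_state = {"id": entity_graph["id"], "name": entity_graph["name"]}

# Size of what would have to be stored and reloaded on every request.
graph_bytes = len(json.dumps(entity_graph).encode("utf-8"))
targeted_bytes = len(json.dumps(targeted_state).encode("utf-8"))
print(graph_bytes, targeted_bytes)
```

Whatever the exact numbers are, the full graph is paid for on every single post, by every single user, against the same database.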

Custom Caching
We used Entity Framework 3.5 for this application (that was the wrong decision) and there were a few technical challenges along the way, one being caching. From the early stages through to the middle of the project it was everyone's belief that caching would just be plugged in right at the end, and all our problems would just disappear....
This was not the case, and through some genius thought process a new custom cache approach was devised. I must admit, it seemed like it was going to work. However, through some undocumented "features" of EF 3.5, we hit a wall, and instead of trying to understand the problem and reassess, we kept on banging the proverbial head against it. I think there comes a point where we just need to stop.

Reporting direct from db
Another genius decision was to run reports directly from the database. This would be OK if the data model was simple, but when you need to join 30 tables and do a number of calculations to get a simple cost, it does end up having an impact on performance. It probably would not even have been an issue on a smaller (in data size) complicated database, but with over 2 million records and counting (very rapidly), those computations and joins become a very nasty bottleneck.
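One common alternative, which is not something this project did, is to pre-compute the expensive figures into a summary table and have the reports read from that. Below is a minimal Python sketch using an in-memory SQLite database; the table names (job, line_item, job_cost_summary) and the refresh_summary function are hypothetical, and a real 30-table join would obviously be far heavier than this 2-table one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE job (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE line_item (job_id INTEGER, quantity INTEGER, unit_cost REAL);
    CREATE TABLE job_cost_summary (job_id INTEGER PRIMARY KEY, total_cost REAL);
    INSERT INTO job VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO line_item VALUES (1, 2, 10.0), (1, 1, 5.0), (2, 4, 2.5);
    """
)

# The "live" report: joins and calculations done at report time.
LIVE_REPORT = """
    SELECT j.id, SUM(li.quantity * li.unit_cost)
    FROM job j JOIN line_item li ON li.job_id = j.id
    GROUP BY j.id ORDER BY j.id
"""


def refresh_summary(conn):
    # Run on a schedule or after a batch, not on every report request.
    conn.execute("DELETE FROM job_cost_summary")
    conn.execute("INSERT INTO job_cost_summary " + LIVE_REPORT)
    conn.commit()


refresh_summary(conn)
live = conn.execute(LIVE_REPORT).fetchall()
summary = conn.execute(
    "SELECT job_id, total_cost FROM job_cost_summary ORDER BY job_id"
).fetchall()
```

Both queries return the same rows here, but the summary read is a single-table lookup. The trade-off is that the summary is only as fresh as its last refresh.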


Web Services
Usually for an application such as this, you separate the UI from the processing engine via web services, which means you can scale out when the performance waistline gets a bit too big. Imagine the entire back end locked into 1 process on 1 machine running under load, versus 1 process running on each of x machines. If performance becomes a problem, just increase x and the problem goes away (hopefully). However, this was not the case for this application. Web services were not built into the application, and there was no way of expanding the performance waistline once processors started to overheat. I think this was quite a critical issue which was overlooked, and it severely hampered the scalability of the application.
