Monday, May 27, 2013

Liskov substitution principle

Recently I had a chance to upgrade a software component and got a better understanding of the Liskov substitution principle.  I want to share that experience here.

First I want to give a clear definition of the Liskov principle.  This is the original formulation:

Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it.

The following is from Wiki:

Liskov substitution principle:  objects in a program should be replaceable with instances of their subtypes without altering the correctness of that program.

The famous violation of this principle is the circle-ellipse problem (or the similar square-rectangle problem).  You can find detailed information in the Wikipedia article.

Scenario

The component creates and manages our software license.  Basically the code handles a chunk of binary memory, and the .NET BitArray class is a good start.  BitArray treats every bit as a boolean value and essentially turns managing memory into managing a list of boolean values, which makes the job much easier.  But since BitArray is sealed, we cannot extend it, so our base class wraps the BitArray class and adds some overloaded methods to make the operations easier.  The base class looks like this:


    public class CommonBits
    {
        private BitArray _bits;

        public void Set (int index, bool value) { }
        public void Set (int index, byte value) { }
        public void Set (int index, int value) { }
        ...

        public bool Get (int index) { }
        public int GetInt (int index) { }
        ...
    }

The derived class adds some secure operations to the base class.  It still handles the binary memory, but it adds CRC verification to the original bit array.  The interface looks like this:


    public class SecureBits : CommonBits
    {
        public void Encode() { }
        public void SetCRC() { }
        public void GetCRC() { }
    }

SecureBits prefixes a 32-bit CRC checksum to the original BitArray.  We can use the following diagram to demonstrate the difference between the two classes:

    [Diagram: SecureBits = 32-bit CRC header followed by the bit array that CommonBits manages]

Problem
Now comes the problem.  Suppose a user of the CommonBits class simply does this:

        var bits = new CommonBits();
        bits.Set (1, true);
        bits.Set (100, false);


However, if you replace the above code with the derived class SecureBits, the problem appears.  Since the first 32 bits are calculated from the actual data that follows them, the user cannot set any of those bits individually.  The Set() method in the derived class has to be something like this:
        public void Set (int index, bool value)
        {
            if (index < 32)
                throw new InvalidOperationException ("The CRC value cannot be set.  It is calculated automatically from your actual bit array.");
            ...
        }

This code clearly violates the Liskov principle's rule that "no new exceptions should be thrown" by methods of a subtype.
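To make the violation concrete, here is a minimal sketch.  It assumes Set() is virtual in the base class (the snippets above don't show modifiers), and the client method ClearLeadingBits is a hypothetical example, not the component's real code:

```csharp
using System;
using System.Collections;

public class CommonBits
{
    protected BitArray Bits = new BitArray(128);
    public virtual void Set(int index, bool value) => Bits[index] = value;
    public bool Get(int index) => Bits[index];
}

public class SecureBits : CommonBits
{
    // The first 32 bits hold the CRC and may not be set directly.
    public override void Set(int index, bool value)
    {
        if (index < 32)
            throw new InvalidOperationException("The CRC value cannot be set directly.");
        base.Set(index, value);
    }
}

public static class Client
{
    // Written against the base class: works for CommonBits,
    // but throws for SecureBits -- the substitution fails.
    public static void ClearLeadingBits(CommonBits bits)
    {
        for (int i = 0; i < 32; i++)
            bits.Set(i, false);
    }
}
```

The client has no way to know it must avoid the first 32 bits, which is exactly what the principle forbids.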


Solution

My solution is to change the inheritance to composition, which coincides with the "favor composition over inheritance" principle.  The code is something like this:

    public class SecureBits
    {
        public int CRC32Value
        {
            get { return CalculateCRC(); }
        }

        public CommonBits BitsData { get; set; }

        public CommonBits ToCommonBits() { }
    }

Through the BitsData property the user can access all CommonBits operations, but he/she cannot set the CRC value.  I also provide a ToCommonBits() method so the user can convert the instance to a normal CommonBits instance.  From there on, if the user changes some bits, the CRC value will not change accordingly, but it is still useful when the user wants to use other methods of CommonBits.
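As a rough sketch of this composition design: the CRC property can simply recompute its value from the wrapped bits on every read, so it can never go stale and there is no setter to misuse.  The ToBytes() helper and the standard CRC-32 routine standing in for CalculateCRC() are my own illustrative choices, not the original component's code:

```csharp
using System;
using System.Collections;

public class CommonBits
{
    private readonly BitArray _bits;
    public CommonBits(int length) { _bits = new BitArray(length); }
    public void Set(int index, bool value) => _bits[index] = value;
    public bool Get(int index) => _bits[index];

    // Helper (assumed) so the wrapper can feed the bits to a checksum.
    public byte[] ToBytes()
    {
        var bytes = new byte[(_bits.Length + 7) / 8];
        _bits.CopyTo(bytes, 0);
        return bytes;
    }
}

// Composition: SecureBits owns a CommonBits instead of inheriting from it,
// so no Set() override has to throw.
public class SecureBits
{
    public CommonBits BitsData { get; set; }
    public SecureBits(int length) { BitsData = new CommonBits(length); }

    // Recomputed on every read, so it always matches the current bits.
    public uint CRC32Value => Crc32(BitsData.ToBytes());

    private static uint Crc32(byte[] data)
    {
        uint crc = 0xFFFFFFFF;
        foreach (byte b in data)
        {
            crc ^= b;
            for (int i = 0; i < 8; i++)
                crc = (crc & 1) != 0 ? (crc >> 1) ^ 0xEDB88320 : crc >> 1;
        }
        return ~crc;
    }
}
```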

Wednesday, April 10, 2013

ObservableCollection performance issue

When a collection is needed for data binding in a WPF/Silverlight application, ObservableCollection is the class to use.  Normally ObservableCollection is good enough for data binding.  However, if the collection grows very large and performance becomes an issue, we may want to look for other options.

Background

I had a list which could contain more than ten thousand records.  In my case, filling the file list brought some performance issues and took 20 seconds or so.  In my opinion, 20 seconds is not terribly bad considering the really time-consuming work that had to be done in that time.  In order not to freeze the GUI elements during this time span, I used two separate background workers to do the background work, while the main thread only updates the GUI elements.

Although I found I could not cut much time from those two background threads, I did notice that updating the GUI took a long time and the list was refreshing very frequently.  The issue is related to the design of ObservableCollection.  When I used ILSpy to check the ObservableCollection class, I found that every time an item is added or removed, it fires one CollectionChanged event and two PropertyChanged events.  That means if ten thousand files are added, thirty thousand events are fired.  These events cause the GUI elements to refresh, which can be very time-consuming.  In my case, real-time refreshing was not necessary.  The better solution is to refresh the GUI only after all items are added, or after a fixed batch of items has been added.  This can save a lot of time.

RangeObservableCollection class

First, a change to ObservableCollection is a good start: we need bulk add and delete operations.  There are several code examples on the Internet; the one from peteohanlon looks simple and neat.  However, after checking the related discussion, I found that every add operation in peteohanlon's solution, although it doesn't fire a CollectionChanged event, still fires PropertyChanged events.  weston made very good points in that discussion.  So I changed my RangeObservableCollection class to the following:


    public class RangeObservableCollection<T> : ObservableCollection<T>
    {
        public void AddRange(IEnumerable<T> list)
        {
            if (list == null)
                return;

            foreach (T item in list)
                Items.Add(item);

            SendNotifications();
        }

        public void RemoveRange(IEnumerable<T> list)
        {
            if (list == null)
                return;

            foreach (T item in list)
                Items.Remove(item);

            SendNotifications();
        }

        private void SendNotifications()
        {
            OnCollectionChanged(new NotifyCollectionChangedEventArgs(NotifyCollectionChangedAction.Reset));
            OnPropertyChanged(new PropertyChangedEventArgs("Count"));
            OnPropertyChanged(new PropertyChangedEventArgs("Item[]"));
        }
    }

I also wrote some unit test methods to check how many events are fired and how the performance is.  I could prove that my class fires only one CollectionChanged event and two PropertyChanged events for each bulk add.  For the performance test, I simply prepared a list with one million records and added it to the different collection classes.  In my test, adding items one by one to ObservableCollection took 0.230 seconds, while AddRange on my RangeObservableCollection took only 0.072 seconds.  When I tested peteohanlon's class, it took almost the same time as the plain ObservableCollection class, so I guess the PropertyChanged events do take some resources.
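A sketch of such an event-counting test might look like this.  The EventCountDemo wrapper is illustrative; the AddRange body matches the class shown above (RemoveRange omitted for brevity).  Note that external subscribers reach ObservableCollection's PropertyChanged event through the INotifyPropertyChanged interface:

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Collections.Specialized;
using System.ComponentModel;

public class RangeObservableCollection<T> : ObservableCollection<T>
{
    public void AddRange(IEnumerable<T> list)
    {
        if (list == null) return;
        foreach (T item in list)
            Items.Add(item);          // backing list: no per-item notifications
        OnCollectionChanged(new NotifyCollectionChangedEventArgs(NotifyCollectionChangedAction.Reset));
        OnPropertyChanged(new PropertyChangedEventArgs("Count"));
        OnPropertyChanged(new PropertyChangedEventArgs("Item[]"));
    }
}

public static class EventCountDemo
{
    // Returns how many events a single AddRange of 'items' records fires.
    public static (int collectionChanged, int propertyChanged) Count(int items)
    {
        var col = new RangeObservableCollection<int>();
        int cc = 0, pc = 0;
        col.CollectionChanged += (s, e) => cc++;
        ((INotifyPropertyChanged)col).PropertyChanged += (s, e) => pc++;

        var data = new List<int>();
        for (int i = 0; i < items; i++) data.Add(i);

        col.AddRange(data);           // one Reset + two property notifications
        return (cc, pc);
    }
}
```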

Again, my test only exercised the collection operations and was not related to any GUI updates.  I guess the main advantage of bulk add and delete is that we save GUI updates, which can be critical to performance.  In real data binding, GUI elements respond to all the PropertyChanged and CollectionChanged events and refresh the control, which can be a huge waste of resources.


An internal list

Furthermore, I used an internal list to keep all my data.  After a fixed number of files were added, I called the AddRange() method to add them to the RangeObservableCollection instance.  After all files were added, I called a Sort() method on my internal list and re-created the RangeObservableCollection instance.  The overhead here is re-creating the collection instance.

When deleting, I always worked on the internal list; only when all queried files were deleted did I call the Sort() function and re-create the collection instance.  Again there is a re-creation overhead here.  In my test, deleting from the internal list was much faster than deleting directly from the data-bound ObservableCollection instance.
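The batching idea above can be sketched like this.  BufferedSource, FlushThreshold, and the Action-based flush hook are my own illustrative names, not the project's API; in practice the flush action would be the collection's AddRange:

```csharp
using System;
using System.Collections.Generic;

// Items accumulate in a plain List<T> and are handed to the bound
// collection in fixed-size batches, so the GUI refreshes once per batch
// instead of once per item.
public class BufferedSource<T>
{
    private readonly List<T> _buffer = new List<T>();
    private readonly Action<IEnumerable<T>> _flush;   // e.g. rangeCollection.AddRange
    public int FlushThreshold { get; set; } = 500;

    public BufferedSource(Action<IEnumerable<T>> flush) { _flush = flush; }

    public void Add(T item)
    {
        _buffer.Add(item);
        if (_buffer.Count >= FlushThreshold)
            Flush();
    }

    // Call once more at the end to push the last partial batch.
    public void Flush()
    {
        if (_buffer.Count == 0) return;
        _flush(_buffer.ToArray());
        _buffer.Clear();
    }
}
```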

Friday, January 18, 2013

FileSystemWatcher tips

Recently in one WPF project I needed to monitor multiple folders for possible file and folder changes.  I remember that in VC++ we had to use the Win32 API to create our own thread data and run it in a different thread.  In the .NET world, FileSystemWatcher seems to be the only reasonable choice, and you don't have to run multiple threads yourself; the .NET framework manages the monitoring thread for you.  However, when I started using it, I found some issues which could affect how, and whether, you can use it.  I list some concerns and tips here; these are the basics you may be interested in.  I may write another post to list the related code.

1. Some events will be fired multiple times.

When you rename a file, several events can fire.  This is a known issue with file watchers.  If you process the changes in the event handler, you could process one change multiple times.  A good choice is to group the changes together and process them only once, so a Timer may be useful here.  I will discuss Timers in a later section.

2. The monitored folder name change event is not fired.

I was really frustrated when I debugged and found that no events fired when the monitored folder itself was renamed.  Actually, FileSystemWatcher does catch the event; furthermore, it switches to the new folder and starts monitoring it.  But you just cannot catch the event yourself.  So this is the designed behaviour, but apparently not what I wanted, because I needed to display the new folder name immediately.  Monitoring the renaming event was a must.

The original idea came from this thread: create another watcher to monitor the folder's parent folder, and watch only the directory-name renaming event.  Meanwhile you can specify a filter to watch only the subfolder you are interested in.  The following code creates the parent watcher.

            _parentWatcher = new FileSystemWatcher();
            _parentWatcher.Path = Directory.GetParent(_watchedFolder).FullName;
            // Filter on the watched folder's own name so only its rename is reported.
            string filter = _watchedFolder.Substring(_watchedFolder.LastIndexOf('\\') + 1);
            _parentWatcher.Filter = filter;
            _parentWatcher.IncludeSubdirectories = false;
            _parentWatcher.NotifyFilter = NotifyFilters.DirectoryName;
            _parentWatcher.Renamed += ParentWatcher_Renamed;   // rename handler (not shown)
            _parentWatcher.Error += Watcher_Error;
            _parentWatcher.EnableRaisingEvents = true;

3. You cannot rename or delete the monitored folder's parent folders.

Let's say you are watching folder A.  You are OK to change any files under A, rename folder A, and maybe even delete A.  But you cannot rename A's parent B, B's parent C, and so on.  Check this thread.  When you try to rename one of them from Windows Explorer, you will get the infamous message:
The action can't be completed because the folder or a file in it is open in another program

This seems ridiculous since no file or folder is actually open; the folder is merely being monitored.  Again, this is the designed behaviour.

A workaround could be to watch your entire drive, say C:\ or D:\.  As long as the watched root has no parent, you don't have to worry about a parent folder being renamed.  But this probably brings performance issues, because you watch many unnecessary changes, especially on network drives.

4. Use a Timer to group multiple events

To be honest, I am not a fan of Timers.  I always feel Timers are second-class citizens in the system and not reliable.  Maybe I am wrong, but I have that feeling.  Still, in cases where missing an event is not critical, Timers can do the work.  Note that there are at least three timers:  System.Timers.Timer, System.Threading.Timer, and System.Windows.Threading.DispatcherTimer.  This thread discussing Timers may be useful.  I used System.Timers.Timer in this case.  Somebody suggested using DispatcherTimer, but it turns out DispatcherTimer behaves differently from Timer.

By stopping and restarting the timer on every event, you keep pushing the deadline back; the timer only elapses after a quiet period, which groups all the changes together and fires a single final change request.
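That stop-and-restart pattern is essentially a debounce.  Here is a minimal sketch with System.Timers.Timer; the ChangeDebouncer class, the Poke() name, and the 500 ms default are illustrative choices, not the project's code:

```csharp
using System;
using System.Timers;

// Each watcher event restarts a short one-shot timer, so a burst of events
// produces a single Elapsed callback after things go quiet.
public class ChangeDebouncer : IDisposable
{
    private readonly Timer _timer;

    public ChangeDebouncer(Action onQuiet, double quietMs = 500)
    {
        _timer = new Timer(quietMs) { AutoReset = false };  // fire once per quiet period
        _timer.Elapsed += (s, e) => onQuiet();
    }

    // Call this from every FileSystemWatcher event handler.
    public void Poke()
    {
        _timer.Stop();
        _timer.Start();   // restart the countdown
    }

    public void Dispose() => _timer.Dispose();
}
```

Each watcher handler (Changed, Created, Deleted, Renamed) just calls Poke(); the onQuiet callback is where the single refresh request is sent.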

Another thing to notice: in the Timer elapsed event, we should call

            _uiDispatcher.BeginInvoke(new Action(() => {
                Messenger.Default.Send("StartRefreshing", "StartRefreshing");
            }));

not just simply send the message:
              Messenger.Default.Send("StartRefreshing", "StartRefreshing");
  
The difference is that the latter sends the message on a background thread while the former sends it to the main UI thread.  If you change UI elements in the responding function, in the latter case you will get
              The calling thread cannot access this object because a different thread owns it
  
because in WPF only the main GUI thread can change the GUI elements.  We can use Dispatcher.Invoke or BeginInvoke to execute a delegate on the dispatcher thread.  In the middle of this work I wanted to use DispatcherTimer, but it turned out its events never fired.  Check this thread and this thread for the possible reason: a DispatcherTimer is created in one thread, will only fire events in that thread, and only that thread's dispatcher can access those events.

5. Do we need explicit multiple threads?

Here comes another common concern: do we need to explicitly put the file monitor on another thread?  Check this discussion.  The answer is no, because the .NET framework handles it; the class creates a thread if it needs one.  So unless necessary, you don't have to create a thread for the file monitor.  This is different from the old Win32 way and of course a nice improvement.


Wednesday, October 24, 2012

Kill the session when the browser is closed

In one of my projects, I used Forms Authentication to authenticate users against our intranet domain by using LDAP.  The good things are that users don't have to remember another user name and password pair, and forms authentication gives users the ability to log out, which they cannot do with Windows Integrated Authentication.

By design, if users click something for which they don't have permission, forms authentication automatically redirects them to the login page, where they can log in with another account to access the desired resource.  However, in some use cases users don't like this behaviour.  They argue that it's rare for users to have two accounts in the system, so if they don't have permission we shouldn't redirect them to the login page; we just need to display a message indicating they need an admin to assign them the proper permission.  My workaround is that when users are redirected to the login page, I display some information there explaining why they are seeing it.

Another user requirement is to kill the session and log out when the browser is closed.  This requirement looks simple, but it took me a long time to work on, and even worse, I still haven't found a proper way to resolve the problem.  But I do have some workarounds at this stage.

When I searched the internet, most people said you couldn't log out automatically.  This is true in that the session is stored on the server; when users close the browser, the browser normally just closes and doesn't send a message to the server, so the server never knows the client is gone.

But on some bank websites, closing the browser logs you out immediately; when you go back to the site, it always asks for your credentials.  I don't know exactly how they do it, but I think they might use their own login mechanism: users are always directed to the login page when they first open the website, and when the login page opens, the site always clears the old session and old credentials, so users have to enter their account again.  Since users are never redirected to the login page in the middle of a session, this clearing causes no trouble.

I referenced this blog and this discussion and this discussion to implement my workaround.

First, create an Exit.aspx page.  In the page load method, do the following:

    protected void Page_Load(object sender, EventArgs e)
    {
        FormsAuthentication.SignOut();
        Session.Abandon();
       
        // clear authentication cookie
        HttpCookie cookie = new HttpCookie(FormsAuthentication.FormsCookieName, "");
        cookie.Expires = DateTime.Now.AddYears(-1);
        Response.Cookies.Add(cookie);

        FormsAuthentication.RedirectToLoginPage();
    }

The code kills the session and logs out the user.  I put the code in the page load event handler, so this page is never actually displayed.

Then add the following JavaScript code to the normal pages.  Since I have a master page for all my pages, I can put this code in the master page so every page is affected.

<script type="text/javascript" >
    var closeTab = false;
    window.onbeforeunload = checkBrowserBefore;
    window.onunload = checkBrowserUnload;

    function checkBrowserBefore() {
        if (window.event.clientY <= 0) {
            closeTab = true;
            return "Do you leave the application or stay?";
        }
        else
            closeTab = false;
    }
    function checkBrowserUnload() {
        if (closeTab)
            window.location.href = "../Exit.aspx";
    }
</script>

Notice that I handle two events, onbeforeunload and onunload.  When users close the browser tab or the browser, onbeforeunload fires.  In IE a message box pops up asking the user to confirm closing.  If the user clicks Yes (or, in some newer IE versions, chooses Leave this page), onunload fires; otherwise it does not.  That's why we have to handle both events.

Again, remember this is just a workaround and is not guaranteed.  It only works in IE, and not even in all IE versions.  In IE7 it works when users close either the IE tab or the entire browser.  In IE9 it works when users close the tab, but not when they close the browser.  I haven't tested other IE versions or other browsers.

It also doesn't work when users kill the browser process in Task Manager or close it with Alt + F4.

Sunday, March 14, 2010

DevTeach 2010 Toronto

DevTeach 2010 Toronto was held at the Microsoft headquarters in Mississauga on Mar 8-12.  The main conference ran from Mar 9 to Mar 11, plus a one-day pre-conference workshop and a one-day post-conference workshop.  Main conference attendees were eligible for a one-year MSDN subscription, which is probably why so many people attended.

Compared to TechDays, the sessions at DevTeach are more advanced, covering architecture, agile, Silverlight, SharePoint, and general web and Windows development.  I took the main conference and the pre-conference workshop.  I don't think the latter was worth the cost ($400 for one day); the Silverlight workshop I took simply repeated a series of sessions we probably already knew from previous TechDays, and the organization was not professional.

I liked the main conference.  The presenters who impressed me include Michael Stiefel, Kimberly Tripp, Donald Belcham, and others.  Some architecture sessions were very attractive, most of them related to interfaces, layers, design patterns, design principles, and ORM.  A couple of SQL sessions from Kimberly Tripp were outstanding.  I also took some sessions focused on particular fields, such as jQuery, LINQ, SharePoint, and web farms.

It's beneficial and important to know what people in the community are thinking and doing.  When you listen to people, you refresh your memory and can probably come up with some new ideas.  I hope this conference experience helps improve our in-house WPF and Silverlight projects.