If you've used both DataSet and DataTable, you must have seen the DataTable.Select method. This is an interesting method that allows to select rows using a set of criterias, like IS NULL and comparison operators referencing columns of the current table, as well as columns from other tables using relations. The problem with method is that is returns a DataRow[], on which you cannot perform an other select.

The solution is actually quite simple : Just copy the rows you'll answer me. Yes, but you can't just reference rows in two DataTable instances, so you also have to perform a deep copy of the rows. So, with a little digging in the DataTable methods, here is what you get :

public static DataTable Select(DataTable table, string filter, string sort)
{
   DataRow[] rows = table.Select(filter, sort);
   DataTable outputTable = table.Clone();

   outputTable.BeginLoadData();

   foreach(DataRow row in rows)
      outputTable.LoadDataRow(row.ItemArray, true);

   outputTable.EndLoadData();
   return outputTable;
}

Clone is used to copy the table schema only, BeginLoadData/EndLoadData to disable any event processing during the load operation, and LoadDataRow to effectively load each row. This seems to be a fairly fast way to copy a table's data.

Now, I wondered how they would do this in C# 3.0, since there is a lot of data manipulation with the new LINQ syntax. This version is quite interesting because instead of evolving the runtime, they chose to upgrade only the language by adding features that generate a lot of code under the hood. That was the case in C# 2.0 with iterators and anonymous methods. C# 1.0 also had this with foreach, using or lock for instance.

In the particular case of Linq, C# 3.0 generates a method invocation list  of a LINQ query, producing standard C# 3.0 code with the help of lambda expressions. For example, these two lines are equivalent :

   var query = from a in test where a > 2 select a;
   var query2 = Sequence.Where(test, a => a > 2);

This ties a little more the compiler to the system asssemblies, but this does not matter anymore.

By the way, you can apply queries to standard arrays and join them :

static void Main(string[] args)
{
   var names = new[] {
      new { Id=0, name="test" },
      new { Id=1, name="test1" },
      new { Id=2, name="test2" },
      new { Id=4, name="test2" },
   };

   var addresses = new[] {
      new { Id=0, address="address" },
      new { Id=1, address="address1" },
      new { Id=2, address="address2" },
      new { Id=3, address="address2" },
   };

   var query = from name in names
      join address in addresses on name.Id equals address.Id
      orderby name.name
      select new {name = name.name, address = address.address};

   foreach(var value in query)
      Console.WriteLine(value);
}

I've joined the two arrays using the Id field, and creating a new type that extracts both name and address. I really like inline querying because you can query anything that implements IEnumerable.

I'm also wondering how it'll fit into eSQL (Entity SQL)...

But back to the original subject of this post. They had to do some kind of a DataTable copy in the C# 3.0 helper library, which uses extension methods :

   DataTableExtensions.ToDataTable<T>(IEnumerable<T>)

And with some further digging, I found that the LoadDataRow method for copying data is the fastest way to go.

I also found out using the great reflector that there is an Expression compiler in System.Expressions.Expression<T>. Maybe they finally did expose an expression parser that we can use... I'll try this one too !

</font></font></span>