Metaprogramming – any tips?

Post by Natural ChemE on June 9th, 2013, 9:19 pm

I’ve been writing some programs that write other programs. Everything in this post is in C#, assuming .NET 4.5, and dynamic compilation is done with CodeDOM.

As a primary example, here’s part of what one of my programs wrote to solve a system of nonlinear equations using a gradient descent method (line search, convergence check, etc. not included):
Code:
for (int i = 0; i < iterationCount; ++i)
{
   conc_0 = initConc_0-2D*xi_0-2D*xi_1-xi_2-xi_3-xi_4-xi_5-xi_7;
   conc_1 = initConc_1+xi_4+xi_5;
   conc_2 = initConc_2-xi_1;
   conc_3 = initConc_3+xi_6;
   conc_4 = initConc_4-xi_4;
   conc_5 = initConc_5+xi_0+xi_1+xi_2+xi_3+xi_4+xi_7;
   conc_6 = initConc_6-xi_6;
   conc_7 = initConc_7-xi_3;
   conc_8 = initConc_8+xi_3;
   conc_9 = initConc_9-xi_5;
   conc_10 = initConc_10+xi_1-xi_2+xi_5;
   conc_11 = initConc_11+xi_0+xi_6;
   conc_12 = initConc_12+xi_2;
   conc_13 = initConc_13+xi_7;
   conc_14 = initConc_14-xi_7;
   f_0 = conc_5*conc_11 - beta_0 * (conc_0*conc_0);
   f_1 = conc_5*conc_10 - beta_1 * (conc_0*conc_0*conc_2);
   f_2 = conc_5*conc_12 - beta_2 * (conc_0*conc_10);
   f_3 = conc_5*conc_8 - beta_3 * (conc_0*conc_7);
   f_4 = conc_1*conc_5 - beta_4 * (conc_0*conc_4);
   f_5 = conc_1*conc_10 - beta_5 * (conc_0*conc_9);
   f_6 = conc_3*conc_11 - beta_6 * (conc_6);
   f_7 = conc_5*conc_13 - beta_7 * (conc_0*conc_14);
   J_0_0 = ((conc_11)+(conc_5)) +  beta_0 * (4D*(conc_0));
   J_0_1 = ((conc_11)) +  beta_0 * (4D*(conc_0));
   J_0_2 = ((conc_11)) +  beta_0 * (2D*(conc_0));
   J_0_3 = ((conc_11)) +  beta_0 * (2D*(conc_0));
   J_0_4 = ((conc_11)) +  beta_0 * (2D*(conc_0));
   J_0_5 =  beta_0 * (2D*(conc_0));
   J_0_6 = ((conc_5));
   J_0_7 = ((conc_11)) +  beta_0 * (2D*(conc_0));
   J_1_0 = ((conc_10)) +  beta_1 * (4D*(conc_0*conc_2));
   J_1_1 = ((conc_10)+(conc_5)) +  beta_1 * (4D*(conc_0*conc_2)+(conc_0*conc_0));
   J_1_2 = ((conc_10)-(conc_5)) +  beta_1 * (2D*(conc_0*conc_2));
   J_1_3 = ((conc_10)) +  beta_1 * (2D*(conc_0*conc_2));
   J_1_4 = ((conc_10)) +  beta_1 * (2D*(conc_0*conc_2));
   J_1_5 = ((conc_5)) +  beta_1 * (2D*(conc_0*conc_2));
   J_1_7 = ((conc_10)) +  beta_1 * (2D*(conc_0*conc_2));
   J_2_0 = ((conc_12)) +  beta_2 * (2D*(conc_10));
   J_2_1 = ((conc_12)) +  beta_2 * (2D*(conc_10)-(conc_0));
   J_2_2 = ((conc_12)+(conc_5)) +  beta_2 * ((conc_10)+(conc_0));
   J_2_3 = ((conc_12)) +  beta_2 * ((conc_10));
   J_2_4 = ((conc_12)) +  beta_2 * ((conc_10));
   J_2_5 =  beta_2 * ((conc_10)-(conc_0));
   J_2_7 = ((conc_12)) +  beta_2 * ((conc_10));
   J_3_0 = ((conc_8)) +  beta_3 * (2D*(conc_7));
   J_3_1 = ((conc_8)) +  beta_3 * (2D*(conc_7));
   J_3_2 = ((conc_8)) +  beta_3 * ((conc_7));
   J_3_3 = ((conc_8)+(conc_5)) +  beta_3 * ((conc_7)+(conc_0));
   J_3_4 = ((conc_8)) +  beta_3 * ((conc_7));
   J_3_5 =  beta_3 * ((conc_7));
   J_3_7 = ((conc_8)) +  beta_3 * ((conc_7));
   J_4_0 = ((conc_1)) +  beta_4 * (2D*(conc_4));
   J_4_1 = ((conc_1)) +  beta_4 * (2D*(conc_4));
   J_4_2 = ((conc_1)) +  beta_4 * ((conc_4));
   J_4_3 = ((conc_1)) +  beta_4 * ((conc_4));
   J_4_4 = ((conc_5)+(conc_1)) +  beta_4 * ((conc_4)+(conc_0));
   J_4_5 = ((conc_5)) +  beta_4 * ((conc_4));
   J_4_7 = ((conc_1)) +  beta_4 * ((conc_4));
   J_5_0 =  beta_5 * (2D*(conc_9));
   J_5_1 = ((conc_1)) +  beta_5 * (2D*(conc_9));
   J_5_2 = ((conc_1)) +  beta_5 * ((conc_9));
   J_5_3 =  beta_5 * ((conc_9));
   J_5_4 = ((conc_10)) +  beta_5 * ((conc_9));
   J_5_5 = ((conc_10)+(conc_1)) +  beta_5 * ((conc_9)+(conc_0));
   J_5_7 =  beta_5 * ((conc_9));
   J_6_0 = ((conc_3));
   J_6_6 = ((conc_11)+(conc_3)) +  beta_6 * (1D);
   J_7_0 = ((conc_13)) +  beta_7 * (2D*(conc_14));
   J_7_1 = ((conc_13)) +  beta_7 * (2D*(conc_14));
   J_7_2 = ((conc_13)) +  beta_7 * ((conc_14));
   J_7_3 = ((conc_13)) +  beta_7 * ((conc_14));
   J_7_4 = ((conc_13)) +  beta_7 * ((conc_14));
   J_7_5 =  beta_7 * ((conc_14));
   J_7_7 = ((conc_13)+(conc_5)) +  beta_7 * ((conc_14)+(conc_0));
   xi_0 -= stepSize * f_0 * (J_0_0 + J_0_1 + J_0_2 + J_0_3 + J_0_4 + J_0_5 + J_0_6 + J_0_7);
   xi_1 -= stepSize * f_1 * (J_1_0 + J_1_1 + J_1_2 + J_1_3 + J_1_4 + J_1_5 + J_1_7);
   xi_2 -= stepSize * f_2 * (J_2_0 + J_2_1 + J_2_2 + J_2_3 + J_2_4 + J_2_5 + J_2_7);
   xi_3 -= stepSize * f_3 * (J_3_0 + J_3_1 + J_3_2 + J_3_3 + J_3_4 + J_3_5 + J_3_7);
   xi_4 -= stepSize * f_4 * (J_4_0 + J_4_1 + J_4_2 + J_4_3 + J_4_4 + J_4_5 + J_4_7);
   xi_5 -= stepSize * f_5 * (J_5_0 + J_5_1 + J_5_2 + J_5_3 + J_5_4 + J_5_5 + J_5_7);
   xi_6 -= stepSize * f_6 * (J_6_0 + J_6_6);
   xi_7 -= stepSize * f_7 * (J_7_0 + J_7_1 + J_7_2 + J_7_3 + J_7_4 + J_7_5 + J_7_7);
}
In this code, J_7_4 refers to the Jacobian term $\partial f_7 / \partial \xi_4$, i.e. the derivative of f_7 with respect to xi_4.
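As a check on how the J terms follow from the equations in the snippet, here is J_7_4 worked out by hand. The shorthand $c_k$ for conc_k and $\xi_j$ for xi_j is introduced here only for readability, and J_i_j is taken to mean the derivative of f_i with respect to xi_j, which is what the generated lines are consistent with.

```latex
\begin{align*}
f_7 &= c_5\, c_{13} - \beta_7\, c_0\, c_{14}, \\
\frac{\partial c_5}{\partial \xi_4} &= 1, \quad
\frac{\partial c_0}{\partial \xi_4} = -1, \quad
\frac{\partial c_{13}}{\partial \xi_4} = \frac{\partial c_{14}}{\partial \xi_4} = 0, \\
\frac{\partial f_7}{\partial \xi_4}
  &= c_{13} \cdot 1 - \beta_7 \left( -1 \cdot c_{14} \right)
   = c_{13} + \beta_7\, c_{14}.
\end{align*}
```

This agrees with the generated line J_7_4 = ((conc_13)) + beta_7 * ((conc_14));.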

I really like this approach because those equations would’ve been far, far nastier if the program that generated that method generated it as a fully general approach.

Coding like this has left me with a few questions that I’d like to ask.

Questions:
1. In the above snippet, arrays, matrices, etc. were completely eliminated by writing their elements as individual variables; i.e., there was no Jacobian matrix with elements $J_{i,j}$, only a lot of variables J_i_j. Since the program that generates this code could easily write it either way, is the individual-variable version actually faster? Note: I’m far more concerned with speed than memory.

2. In the above snippet, there was the line
Code:
J_0_6 = ((conc_5));
I’m 99.999% sure that, after compilation, this becomes identical to
Code:
J_0_6 = conc_5;
Just to make sure: the ()’s don’t hurt performance, since the .NET 4.5 compiler discards all ()’s once the order of operations is determined, correct?

3. In the above snippet, the code sat inside a loop
Code:
for (int i = 0; i < iterationCount; ++i){ /* code here */ }
If iterationCount is known before this code is generated, the generator could instead drop the loop and emit the body iterationCount times. What effect would this have on the program compiled from the generated code? It’d eliminate the need to increment and test i, but would it also have some other weird side effect, such as taking up that many times more of the CPU’s cache, so that the method’s code no longer fits and has to be fetched from RAM, slowing the program down?

4. Fortran should be faster than C# for purely computational stuff like this, right? Anyone know a clean way of getting Fortran to write code that’d link into a C# program like this?

5. Any tips?
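
On question 1, the only reliable answer is probably to measure it. Below is a minimal micro-benchmark sketch, not a definitive test: the class name ScalarVsArray, the three-variable update, and the iteration count are all made up for illustration and are much smaller than the generated solver. It times the same arithmetic written once with individual locals and once with an array.

```csharp
using System;
using System.Diagnostics;

public static class ScalarVsArray
{
    // Hypothetical update written with individual local variables,
    // in the style of the generated code.
    public static double WithScalars(int iterations)
    {
        double x_0 = 1.0, x_1 = 2.0, x_2 = 3.0;
        for (int i = 0; i < iterations; ++i)
        {
            x_0 = 0.5 * x_0 + 0.25 * x_1;
            x_1 = 0.5 * x_1 + 0.25 * x_2;
            x_2 = 0.5 * x_2 + 0.25 * x_0;
        }
        return x_0 + x_1 + x_2;
    }

    // The same update with the values stored in an array.
    public static double WithArray(int iterations)
    {
        double[] x = { 1.0, 2.0, 3.0 };
        for (int i = 0; i < iterations; ++i)
        {
            x[0] = 0.5 * x[0] + 0.25 * x[1];
            x[1] = 0.5 * x[1] + 0.25 * x[2];
            x[2] = 0.5 * x[2] + 0.25 * x[0];
        }
        return x[0] + x[1] + x[2];
    }

    // Times one call, after a short warm-up call so that JIT
    // compilation is not included in the measurement.
    public static TimeSpan Time(Func<int, double> method, int iterations)
    {
        method(1000);
        Stopwatch watch = Stopwatch.StartNew();
        method(iterations);
        watch.Stop();
        return watch.Elapsed;
    }

    // Prints both timings; call this from a Main method.
    public static void Run()
    {
        const int iterations = 10000000;
        Console.WriteLine("scalars: " + Time(WithScalars, iterations));
        Console.WriteLine("array:   " + Time(WithArray, iterations));
    }
}
```

Timings from a toy like this depend on the JIT, the build configuration (Release, no debugger attached), and the machine, so they would need repeating on the real generated method before drawing conclusions.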
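
On question 2, one way to see that parentheses are only grouping syntax is to let the C# compiler build expression trees from both spellings and compare them. This is a sketch under an assumption worth stating: it inspects the compiler's expression tree, not the emitted IL, so it illustrates the point rather than proving it for the generated method (a disassembler such as ildasm would show the IL itself). The class name ParenthesesCheck is hypothetical.

```csharp
using System;
using System.Linq.Expressions;

public static class ParenthesesCheck
{
    // Returns true when the parenthesized and plain spellings
    // produce expression trees that print identically.
    public static bool SameTree()
    {
        Expression<Func<double, double>> withParens = conc_5 => ((conc_5));
        Expression<Func<double, double>> withoutParens = conc_5 => conc_5;

        // Both bodies are a bare parameter node; the tree has no
        // node type that records the redundant parentheses.
        bool simpleMatch =
            withParens.Body.NodeType == ExpressionType.Parameter &&
            withoutParens.Body.NodeType == ExpressionType.Parameter &&
            withParens.ToString() == withoutParens.ToString();

        // Same comparison for a compound expression.
        Expression<Func<double, double>> grouped = a => ((a) + ((2D) * (a)));
        Expression<Func<double, double>> plain = a => a + 2D * a;
        bool compoundMatch =
            grouped.Body.NodeType == plain.Body.NodeType &&
            grouped.ToString() == plain.ToString();

        return simpleMatch && compoundMatch;
    }
}
```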
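
On question 3, the generator side of the choice is simple to sketch. The hypothetical LoopEmitter below emits a body either wrapped in a loop or repeated iterationCount times; the names and the string-based approach are assumptions for illustration, not the generator used for the snippet above. What it makes visible is that the unrolled source, and so the compiled method, grows linearly with iterationCount, which is the cost to weigh against the saved loop counter.

```csharp
using System;
using System.Text;

public static class LoopEmitter
{
    // Emits the body once, wrapped in a for loop.
    public static string EmitLooped(string body, int iterationCount)
    {
        var code = new StringBuilder();
        code.AppendLine("for (int i = 0; i < " + iterationCount + "; ++i)");
        code.AppendLine("{");
        code.AppendLine("   " + body);
        code.AppendLine("}");
        return code.ToString();
    }

    // Emits the body iterationCount times with no loop around it.
    public static string EmitUnrolled(string body, int iterationCount)
    {
        var code = new StringBuilder();
        for (int i = 0; i < iterationCount; ++i)
        {
            code.AppendLine(body);
        }
        return code.ToString();
    }
}
```

Compiling both outputs and timing them for a few values of iterationCount would show where, if anywhere, the unrolled version stops paying off on a given machine.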
Natural ChemE
Forum Moderator
 
Posts: 2744
Joined: 28 Dec 2009


Re: Metaprogramming – any tips?

Post by phillipsshawn on October 29th, 2013, 2:03 am

What is the difference between ++i and i++?


Re: Metaprogramming – any tips?

Post by Natural ChemE on October 29th, 2013, 2:42 am

phillipsshawn,

"++" does two things:
  • increments the variable's value;
  • returns the variable's value.
When prefixed, e.g. ++i, it increments then returns.
When suffixed, e.g. i++, it returns then increments.

Prefix:
Code:
int i = 0;
int x = ++i;  //now i = 1 and x = 1
Here x got i's value after i was incremented, so x is one.

Suffix:
Code:
int i = 0;
int x = i++;  //now i = 1 and x = 0
Here x got i's value before i was incremented, so x is just zero.

I had a Physics professor who was always on about using the prefix whenever possible. His reasoning was that, because the suffix requires storing the pre-incremented value whereas the prefix doesn't, the prefix version is more efficient. And I think that he's right, assuming that the compiler doesn't fix it upon optimization.

But if someone's using a compiler with such poor optimizations that it doesn't pick up something so trivial, I doubt that modifying their code like this is really the route to getting more efficient binaries.

Anyway,
Code:
for (int i = 0; i < iterationCount; ++i){ /* code here */ }
for (int i = 0; i < iterationCount; i++){ /* code here */ }
both do the same thing. Since the loop never uses the value that the increment expression returns and only cares that i was incremented, the order in which it does these operations doesn't matter. Except, as my Physics professor would've pointed out, the prefixed version would be more efficient if the compiler doesn't optimize.
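
A small sketch of that last point: the two loop headers run the body the same number of times. The class and method names are hypothetical, chosen only for this illustration.

```csharp
public static class IncrementLoops
{
    // Counts loop passes using the prefix form in the loop header.
    public static int CountWithPrefix(int iterationCount)
    {
        int passes = 0;
        for (int i = 0; i < iterationCount; ++i) { passes += 1; }
        return passes;
    }

    // Counts loop passes using the suffix form in the loop header.
    public static int CountWithSuffix(int iterationCount)
    {
        int passes = 0;
        for (int i = 0; i < iterationCount; i++) { passes += 1; }
        return passes;
    }
}
```

Both methods return iterationCount for any non-negative input, because the value returned by the increment expression is discarded in a for header.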


Re: Metaprogramming – any tips?

Post by Venus on October 29th, 2013, 11:58 am

Natural ChemE wrote: Any tips?

First, you are reinventing the wheel.

Why not use Mathematica, Matlab or Maple for this?

Second, C# is not meant for number crunching. C# compiles to bytecode (CIL), which the .NET runtime JIT-compiles to machine code at run time rather than having it compiled and optimized ahead of time. If you insist on using C#, there are array processing routines in .NET. You might also want to take a look at F#, which integrates nicely with C#; it won't be faster, but the code will look more elegant.

Worrying about the difference between i++ and ++i in a language like C# does not make any sense.


Re: Metaprogramming – any tips?

Post by Natural ChemE on October 30th, 2013, 3:31 pm

Venus,

Heh, yeah, I usually make threads like this shortly after starting a new project, when I still have no idea what the heck I’m doing. This one’s come a long way since then so the post’s a bit dated.

Do you like the whole analytical thing? Math, programming, etc.?


Re: Metaprogramming – any tips?

Post by Venus on October 30th, 2013, 4:21 pm

Natural ChemE wrote: Venus,

Heh, yeah, I usually make threads like this shortly after starting a new project, when I still have no idea what the heck I’m doing. This one’s come a long way since then so the post’s a bit dated.

Do you like the whole analytical thing? Math, programming, etc.?

No problem, as long as you have fun :)

I presently only program for fun: Prolog, CLP, and some functional programming languages.