Proposal for Linear Algebra: Tensor core (Looking for the mentor)


zhiq...@gmail.com

Mar 27, 2019, 3:13:49 AM
To: sympy

Hello,


My name is Zhiqi KANG. I am a 4th-year undergraduate at a 5-year engineering institution, Université de Technologie de Compiègne, France. I am interested in the project Linear algebra: Tensor core. Here is the link to the description of the project idea: https://github.com/sympy/sympy/wiki/GSoC-2019-Ideas#linear-algebra-tensor-core Even though I am not very familiar with tensors in physics, their mathematical principles are quite interesting. I have looked carefully at the requirements of this project and am confident that I can accomplish most of its tasks. However, there are still many questions that I would like to discuss with all SymPy contributors, and especially with the mentor. One urgent problem is that I cannot find the name of the mentor for this project, so I don't know whom I should CC. Could you please help me find the mentor for this project?


Please review this draft proposal and tell me what should be improved. Thank you!



- Better algorithms for sparse arrays:

The idea is to manipulate arrays directly at the sparse-array level. Casting a sparse array to a dense array and then operating on it is redundant. I have found an example in the tomatrix() function of sympy\sympy\tensor\array\sparse_ndim_array.py, where the sparse array is converted to a new dictionary and then cast to a matrix (code below).

But I cannot find more cases in the array/tensor module; it would be great if someone could help me find where the other cases are.


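def tomatrix(self):
    # Excerpt from sparse_ndim_array.py: convert a rank-2 sparse array to a SparseMatrix.
    from sympy.matrices import SparseMatrix
    if self.rank() != 2:
        raise ValueError('Dimensions must be of size of 2')
    # Re-key the flat-index dictionary by (row, column) tuples.
    mat_sparse = {}
    for key, value in self._sparse_array.items():
        mat_sparse[self._get_tuple_index(key)] = value
    return SparseMatrix(self.shape[0], self.shape[1], mat_sparse)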
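
To illustrate what operating directly at the sparse level could look like, here is a minimal sketch in plain Python. It assumes a sparse array is stored as a dictionary mapping a row-major flat index to a nonzero value, similar in spirit to the _sparse_array dictionary in the excerpt above. The helper names flat_to_tuple and sparse_add are hypothetical and are not part of SymPy.

```python
def flat_to_tuple(flat_index, shape):
    """Convert a row-major flat index into a tuple index for the given shape."""
    index = []
    for dim in reversed(shape):
        flat_index, pos = divmod(flat_index, dim)
        index.append(pos)
    return tuple(reversed(index))


def sparse_add(a, b):
    """Add two sparse arrays stored as {flat_index: value} dictionaries.

    Only stored (nonzero) entries are visited; no dense array is ever built.
    """
    result = dict(a)
    for key, value in b.items():
        total = result.get(key, 0) + value
        if total == 0:
            result.pop(key, None)  # drop cancelled entries to keep the result sparse
        else:
            result[key] = total
    return result


a = {0: 1, 5: 2}       # entries of a 2x3 array at (0, 0) and (1, 2)
b = {5: -2, 3: 4}      # entries at (1, 2) and (1, 0)
c = sparse_add(a, b)   # the entries at flat index 5 cancel out
```

The cost of sparse_add grows with the number of stored entries rather than with the full size of the array, which is the point of avoiding the dense cast.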

 

- NumPy-like operations

We currently have some operations for arrays in SymPy:

  * arrayfy
  * tensor product
  * derivatives by array
  * permute dimension
  * contraction

For this part of the project, I am planning to implement operations such as:

  * sum
  * divide/multiply (element-wise)
  * any
  * comparators (greater/less/equal)
  * logical operators (and/or/not/xor)
  * random
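
For reference, a short sketch of how some of the existing operations listed above can be called on a small symbolic array. The array A and the symbols x and y are only illustrative choices.

```python
from sympy import (Array, symbols, tensorproduct, tensorcontraction,
                   permutedims, derive_by_array)

x, y = symbols("x y")
A = Array([[x, y], [x*y, 1]])

product = tensorproduct(A, A)            # rank-4 array of shape (2, 2, 2, 2)
trace = tensorcontraction(A, (0, 1))     # contract both axes, giving x + 1
transposed = permutedims(A, (1, 0))      # swap the two axes
derivative = derive_by_array(A, x)       # differentiate every entry by x
```

All entries stay symbolic throughout, which is what distinguishes these operations from their purely numeric counterparts.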

       

- Lazy operators on arrays

Lazy evaluation can improve performance when iterating over an array, since a value is created only when it is requested. To implement lazy operators, I am considering two plans:

1.      Create a new sub-module named lazy-array (larray) in which most operations are lazily evaluated. A standard Array can be cast to a lazy array by simply calling the larray constructor and passing the array as a parameter. This lets users choose at the module level whether to manipulate a simple array or a lazy array.

 

2.      Create a lazy version of the operators mentioned above. The lazy operators are available for specific purposes. This implementation works at the function level, which means choosing between calling, for example, a simple inverse_matrix function or a lazy one.

 

Besides, I have found in sympy\sympy\tensor\tensor.py a class _TensorDataLazyEvaluator, which can serve as an example for implementing these functionalities. It has methods such as delete item, inverse matrix, etc.
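
As a rough illustration of the first plan, here is a minimal plain-Python sketch of a wrapper that records element-wise operations and applies them only when values are requested. The class name LazyArraySketch and its methods are hypothetical; this is not an existing SymPy API and not a proposed final design.

```python
class LazyArraySketch:
    """Hypothetical wrapper that records element-wise operations lazily."""

    def __init__(self, data):
        self._data = list(data)
        self._operations = []   # functions waiting to be applied

    def map(self, func):
        """Record an element-wise operation without computing anything."""
        self._operations.append(func)
        return self

    def __iter__(self):
        """Apply the recorded operations one element at a time."""
        for value in self._data:
            for func in self._operations:
                value = func(value)
            yield value

    def evaluate(self):
        """Force evaluation and return a plain list."""
        return list(self)


calls = []

def double(v):
    calls.append(v)     # record that the function actually ran
    return 2 * v

lazy = LazyArraySketch([1, 2, 3]).map(double).map(lambda v: v + 1)
pending = len(calls)            # nothing has been computed at this point
first = next(iter(lazy))        # computes only the first element
result = lazy.evaluate()        # computes every element
```

Because iteration is generator-based, asking for the first element does not touch the remaining ones, which is where the memory and time savings would come from.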

 

- Code generation for arrays and array operators

This part of the project overlaps with another GSoC project devoted purely to code generation. I would like to discuss it with the mentor of the codegen project to get a better view of it.

I did a 6-month internship as a developer at BNP Paribas Securities Services in Paris. During this period I worked on a similar code generation task, except that the programming language was C# (I was using Entity Framework and T4 by Microsoft). I believe this experience can help me become familiar with the code generation process in this project.

 

- Integration over indexed symbols and arrays

Firstly, I would like to talk about integration over arrays:

Can we think of the array as a set of coordinates? Suppose that we have an array A, say 2-dimensional with shape (p, q). We can imagine two axes x and y so that the indices i and j are the coordinates of the point Pij along the x axis and the y axis. The value A[pi,qj] would then be the coordinate along the z axis. Under this assumption, we can use a Riemann integral or a Lebesgue integral to compute the integral, like summing the columns in 3D space.

I don’t know if this idea is correct; I would love to discuss it with you!
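
To make the question concrete, here is a small sketch of two possible readings of "integration over an array". Both readings are assumptions made for discussion, not an established SymPy definition. The first treats the entries as heights over unit cells, so the integral reduces to a Riemann sum of the entries; the second integrates each symbolic entry with respect to a variable.

```python
from sympy import Array, symbols, integrate

x = symbols("x")

# Reading 1: entries are heights over unit cells, so the "integral" is the
# sum of all entries (a Riemann sum with cell area 1).
heights = Array([[1, 2], [3, 4]])
rows, cols = heights.shape
riemann_sum = sum(heights[i, j] for i in range(rows) for j in range(cols))

# Reading 2: entries are expressions, integrated one by one with respect to x.
exprs = Array([[x, x**2], [1, 2*x]])
antiderivatives = exprs.applyfunc(lambda e: integrate(e, x))
```

The first reading is essentially numeric, while the second keeps the result symbolic.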

 

Secondly, for integration over indexed symbols, I don’t really know what it means. Should the output be an expression rather than a value? It would be great if someone could show me an example, thanks!
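
One possible interpretation, offered only as a sketch: an indexed symbol such as A[i] carries no stored data, so an integral that involves it returns an expression. The example below assumes A[i] does not depend on the integration variable x; the names A, i and x are illustrative.

```python
from sympy import IndexedBase, Sum, symbols, integrate, diff, expand

x = symbols("x")
i = symbols("i", integer=True)
A = IndexedBase("A")

# A[i] has no stored data; it behaves like a constant with respect to x.
antiderivative = integrate(A[i] * x, x)

# A sum over the index expands into explicit indexed terms.
total = Sum(A[i], (i, 1, 3)).doit()
```

In both cases the output is a symbolic expression in the indexed symbols rather than a number.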

 

- Equation solving with indexed symbols.

I am not very familiar with this topic either. Should the result be an expression as well? It would be great if someone could show me an example, thanks!
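
As a small illustration of what solving with indexed symbols can look like today, the sketch below solves a linear system whose unknowns are the indexed symbols A[1] and A[2]. The system itself is an arbitrary example.

```python
from sympy import IndexedBase, solve

A = IndexedBase("A")

# Two linear equations in the unknowns A[1] and A[2], each written as expr = 0.
equations = [A[1] + A[2] - 3, A[1] - A[2] + 1]
solution = solve(equations, [A[1], A[2]], dict=True)
```

Here the result maps each indexed symbol to a value; with symbolic coefficients the values would be expressions instead.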


- Implement some well-known tensor math

If time permits, I would be glad to do this extra part of the project. However, I don’t know relativity, electromagnetism, etc. very well, so it would take me some time to understand the principles before starting to work on it. I did find some mathematical formulas associated with this topic: https://en.wikipedia.org/wiki/Integral


- Unify the various SymPy modules

To be done.


Jason Moore

Mar 28, 2019, 12:14:46 PM
To: sympy
Zhiqi,

This looks well thought out at first glance. Check out https://github.com/sympy/sympy/wiki/GSoC-2019-Student-Instructions if you haven't yet.

Jason

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at https://groups.google.com/group/sympy.
To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/a5ff49d3-63e4-45e5-b87f-4b9d6be30085%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
[Message deleted]

zhiq...@gmail.com

Mar 28, 2019, 12:28:04 PM
To: sympy
Hello Jason,

Thank you! It seems that you are quite busy greeting the newcomers right now. :-)
I am waiting for someone to review my proposal. Could you please tell me whom I should reach out to?

Regards,
Zhiqi


Jason Moore

Mar 28, 2019, 12:29:38 PM
To: sympy
Francesco is the most knowledgeable about this subject.

Jason


zhiq...@gmail.com

Mar 28, 2019, 1:20:09 PM
To: sympy
(updated: Lazy operators)

- Lazy operators on arrays

Lazy evaluation can improve performance when iterating over an array, since a value is created only when it is requested. To implement lazy operators, I am considering two plans:

1.      Create a new sub-module named lazy-array (larray) in which most operations are lazily evaluated. A standard Array can be cast to a lazy array by simply calling the larray constructor and passing the array as a parameter. This lets users choose whichever they want at the module level.

       (Updated: a class diagram for Lazyarray.)

[Image: newmodule.PNG]


 

2.      Create a lazy version of the operators mentioned above. The lazy operators are available for specific purposes. This implementation works at the function level for calling lazily evaluated operations.

       (Updated: a class diagram for the inner Lazyevaluation. Example borrowed from Lazyevaluation)

[Image: innerClass.PNG]


(updated: analysis of these approaches)

Personally, I think approach 1) is better because it keeps the existing Array module and the LazyArray independent of each other. It will also make the LazyArray easier to test and maintain. I suppose that Lazyarray should be to Array what NumPy is to list; Lazyarray should be seen as a toolkit.

The disadvantage is that mixing standard Array and Lazyarray objects may lead to some confusion (e.g. should Lazyarray.sum(Array()) return a Lazyarray or an Array object?). The problem can probably be solved by a convention such as: after casting an Array to a Lazyarray, one keeps using Lazyarray. Of course, the conversion from Lazyarray back to Array should be available at any time.

Besides, lazy evaluation should be transparent to users, so we should keep the same naming convention for the operators as we have for Array.

Aaron Meurer

Mar 28, 2019, 2:41:47 PM
To: sympy
Can you clarify what a lazy array would look like? I think this may already be implemented as Indexed.

Aaron Meurer


zhiq...@gmail.com

Mar 28, 2019, 4:28:04 PM
To: sympy
Hello Aaron,

I am thinking about something like this: 

class Lazyarray(object):
    """
    Lazy-evaluated array
    """

    def __init__(self, array):
        """
        Initiate a Lazyarray with an array
        """
        self.array = array  # keep a reference to the wrapped array
        self.operations = []

    def evaluate(self, f=None):
        """
        Evaluate the operations stored in the operations list. If parameter 'f'
        is None then evaluate all operations, otherwise evaluate only the
        function 'f'.
        Some functions will use generator objects for iteration in order to
        save memory.
        """

    def _add_operation(self, op):
        """
        Add an operation to the list of operations
        """

    def sum(self):
        """
        Add the operator Sum to the list of operations, transparent to the users
        """

    def _sum(self):
        """
        An inner function that will be called by evaluate() to perform a sum.
        The real implementation of the sum operator.
        """
Would this be valid? Or is it already implemented?

zhiq...@gmail.com

Mar 28, 2019, 7:01:51 PM
To: sympy
Hello Aaron,

I found the Indexed class that you mentioned. In my opinion, there is a difference between Indexed and Lazyarray. Indexed is an abstraction of a mathematical object with indices, that is, a model for representing tensors in general, while the lazy array performs lazy evaluation of array operators in order to reduce time or memory cost. The goals are different. But I am not sure whether this would be redundant once we take a global look at the Tensor module. I would like to hear your opinion. :-)


Aaron Meurer

Mar 28, 2019, 8:27:18 PM
To: sympy
I think the way to do this is to have operations on array objects that
don't evaluate by default. This would be very similar to the matrix
expressions code, where something like MatMul or MatAdd doesn't
evaluate, even on explicit matrices, unless you call doit() on it
(doit() is what SymPy generally calls your evaluate()).

The actual array type would be the same. The thing that needs to be
lazy is the operation, not the array itself.
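
A minimal sketch of the behaviour described above, using two small explicit matrices chosen only for illustration: the MatAdd object stays unevaluated until doit() is called.

```python
from sympy import ImmutableMatrix, MatAdd

A = ImmutableMatrix([[1, 2], [3, 4]])
B = ImmutableMatrix([[0, 1], [1, 0]])

lazy_sum = MatAdd(A, B)        # stays an unevaluated MatAdd expression
evaluated = lazy_sum.doit()    # performs the addition entry by entry
```

The matrices themselves are ordinary explicit matrices; only the operation is deferred.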

Aaron Meurer

zhiq...@gmail.com

Mar 29, 2019, 3:46:02 PM
To: sympy
Hello,

Ok, I get it! So this lazy operation is an extension of the preceding task, the implementation of NumPy-like operators. As you say, these operators should be added inside the array. I will update the proposal asap.
Besides, I would like to discuss integration over arrays with you. Do you think that the idea about integrating over arrays is correct? For integration over indexed symbols, it would be great if you could help me find an example.
Another urgent problem is that the Linear Algebra topic does not seem to attract much attention in SymPy, and I am worried about whether I can find a mentor who can guide me in improving the proposal and during the summer. Could you please recommend someone?
I will continue to update the proposal in a better Google Drive version.

Thanks a lot for your replies! They help me better understand what the project will be like. Thank you!

Regards,
Zhiqi


zhiq...@gmail.com

Mar 31, 2019, 7:05:38 PM
To: sympy
Hello Aaron,

I have updated my proposal at this link: https://docs.google.com/document/d/1w_-hWrrT89Dr70Xo1TPkQdU80UIVku0Y9cNHADgdfJg/edit?usp=sharing  Could you please review it and give me some advice if you are available? I would sincerely appreciate it.
I will continue to update it because it is not yet complete.

@Everyone
I hope you have had a wonderful weekend. :-)
I would also like to ask for your help in reviewing my proposal and telling me what to improve or clarify.
Thanks in advance!

Regards,
Zhiqi 

Aaron Meurer

Apr 1, 2019, 2:51:29 PM
To: sympy
One thing that worries me is that we shouldn't be reimplementing NumPy
or other numeric libraries inside of SymPy. So something should be put
in SymPy if it has a use for symbolic mathematics. For some things in your
application, it isn't clear whether they fall into this category. I
would try to clarify whether they do; if they don't, they should be
removed.

Aaron Meurer

zhiq...@gmail.com

Apr 1, 2019, 4:32:17 PM
To: sympy
Hello Aaron,
Thank you for your replies.
I think I have made a terrible mistake while preparing for this project.
In fact, it was this sentence in the description of the project idea:
"There are two branches: indexed symbols (which do not contain components data), and array-like objects (containing component data)."
that made me misunderstand the purpose of the implementation for array-like objects. So what does "contain components data" mean? I thought the difference between indexed symbols and arrays was that the latter have numeric values. But it turns out that this is not the case. Could you please clarify a little more?

In the meantime, I will try to modify my proposal so that it at least contains nothing purely numeric. I think the NumPy-like operations and the integration over arrays are what I should modify first.

Thank you for your help!

Regards,
Zhiqi   


zhiq...@gmail.com

Apr 1, 2019, 7:26:02 PM
To: sympy
Hello Aaron, 

I just updated my proposal by modifying what I can understand for now. The things I have updated are:
 1. Modifying the new operators for arrays and making sure they are meant for symbolic mathematics
 2. Adding a second part to this task by referring to issue #15464, in which you talked about changing SymPy syntax to be compatible with NumPy. (Am I doing it right by referring to this issue?)

I would like some clarification about:
 1. what indexed symbols and array-like objects are, as I mentioned in the last email. It would help me figure out how to do the integration over arrays or indexed symbols
 2. the array syntax in SymPy that is supposed to be changed, since you were in the discussion as well. Most of the examples were about Matrix; there is little information about the problems with arrays.

I am sorry for sending you two emails in a row all the time. Please forgive me if it disturbs you; I was trying to be as responsive as possible.
 
Thank you in advance!

Regards,
Zhiqi 



Aaron Meurer

Apr 1, 2019, 8:40:03 PM
To: sympy
On Mon, Apr 1, 2019 at 5:26 PM <zhiq...@gmail.com> wrote:
> I would like to have some clarification about:
> 1. what the indexed symbols and array-like object are, as I mentioned in the last email. It would help me figure out how to do the integration over arrays or indexed symbols

I think you are understanding correctly. The indexed symbols are like
Indexed or MatrixSymbol, which can have symbolic shape and indices,
and do not have any explicit entries set. Whereas array would mean
ImmutableDenseNDimArray and Matrix.

My point about "numeric" is specifically numbers. So an array
containing symbolic expressions isn't numeric, just like Matrix([x])
where x = Symbol('x'). You can also put SymPy symbolic values in a
NumPy array (dtype=object). So if the SymPy array type just emulated
the NumPy type there would be little point to it. But one advantage it
can have is unevaluated array expressions.
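
A short sketch of the distinction drawn above, with illustrative names only: indexed symbols carry a symbolic shape and indices but no entries, while an array-like object stores every entry, and those entries may themselves be symbolic.

```python
from sympy import Array, IndexedBase, MatrixSymbol, symbols

n = symbols("n", integer=True, positive=True)
i, j = symbols("i j", integer=True)
x = symbols("x")

# Indexed symbols: symbolic shape and indices, no explicit entries.
A = IndexedBase("A")
M = MatrixSymbol("M", n, n)
entry = A[i, j]          # just the symbol A[i, j]
shape = M.shape          # (n, n)

# Array-like object: every entry is stored, and entries may be symbolic.
explicit = Array([[x, 1], [0, x**2]])
corner = explicit[1, 1]  # the stored expression x**2
```

Neither object is "numeric" in the sense discussed here, since the explicit array holds symbolic expressions rather than plain numbers.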

> 2. the syntax of array in SymPy that is supposed to be changed, since you were in the discussion as well. Most of the examples were about Matrix, there is little information about array's problem.

I'm not sure what you are referencing here. Probably Francesco would
be better to answer this.

Aaron Meurer


zhiq...@gmail.com

Apr 2, 2019, 7:07:01 PM
To: sympy
Hello Aaron,

First of all, thank you for this clarification! It makes the terms much more understandable.
But one thing still seems confusing to me. Since NumPy is a numeric library and SymPy is a symbolic one, what is the point of having a NumPy-like array in this case? Does it refer to the behavior that you discussed in issue #15464 (link)? Or to the algorithms used in NumPy?

Besides, I talked with Francesco today. He explained many details to me and helped me a lot. He told me that his initial idea is to create a lazy-evaluation mechanism in \sympy\codegen\array_utils.py (link). Do you think it is necessary to have lazy evaluation in the Array module as well? That is what I had in mind the first time I saw the project idea.

Thank you in advance!

Regards,
Zhiqi

[Message deleted]

zhiq...@gmail.com

Apr 6, 2019, 4:31:17 PM
To: sympy
(Final version of proposal)
Hello everyone,

I have updated the final version of my proposal. Could anyone take a look at it and give me some feedback, please? Here is the link: proposal Tensor core