Hi,
I think our preference is to stick to Scala.
On 02/24/2014 11:27 PM, Christopher Medrela wrote:
Hello!
My name is Christopher Medrela and I'd like to participate in GSoC working on a Scala project. Last year I worked on Django (my mentor was Russell Keith-Magee) and I successfully finished my project (revamping the validation framework). This year I'm also applying to Django, but I'd like to try something else.
I'm fluent in Python and I have a lot of interest in Scala. I think that Scala skills wouldn't be the biggest problem, but rather the lack of knowledge of Scala frameworks and libraries. I have basic knowledge of the numpy and pandas libraries.
My English turned out to be good enough to discuss in real time via Skype.
Therefore, I'd like to work on some component/library/framework independent from others. It can be math-heavy. It can be a mix of Python and Scala.
I will let the authors of the ideas speak about that, but...
I fished out two projects: "multidimensional arrays" and "visualization library". In the next days I will focus on the first one.
Do you think this is a good project? Maybe there are some better ones given my skills? Or maybe this project is not worth much to the Scala community and it would be better to focus on the other one?
... that might be a problem. There are some strict rules/deadlines imposed on us by Google but depending on the situation maybe something could be done. I will leave the decision to Tobias but we had some bad experience with students trying to combine two things at the same time so I think we will hesitate to accept students who have another job that overlaps with GSoC. It's good that you are asking now rather than letting us know later though (the latter did happen in the past and is definitely not cool).
The second issue is that I'm going to have an internship in the late summer. I will apply only for those which won't clash with GSoC. Therefore I'd like to start coding as early as possible. Is it possible to shift internal dates so I could start coding earlier (i.e. when the list of accepted students is published -- that is 21 April)?
hubert
--
You received this message because you are subscribed to the Google Groups "scala-language" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-language+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
I also share Hubert's concerns about your timeline. The EPFL guys have more experience with this than me.
I'm going to apply for those internships which won't clash with GSoC. That means that I'm *not* going to do two things at the same time. I won't have any job/holiday during GSoC (except for classes, of course).
I've asked Carol if Google is against internally shifting rules. [1] She said: "We don't police what you deliver to your org and when, simply that you meet the milestones of the program as laid out." Since I'd like to start earlier, not later, the deadlines and milestones of the program are not a problem.
I'm trying to find a point to start from. Unfortunately, there is no ticket tracker, no todo list and the mailing lists don't say much. The only thing I've found is this survey [1]. What do the results of the survey say? Is there any clue what people expect from breeze-viz?
I've heard [2] that ktakagaki (Kenta) can help with breeze-viz and that he has some ideas. Kenta, can you comment on my post and share your ideas?
My plan is to mimic (more or less) the architecture and API of existing libraries like matplotlib, because IMO it's better not to reinvent the wheel and make the same mistakes again. Of course, where possible, I will try to use the power of Scala, e.g. by using operator overloading instead of ordinary method names.
I'm completely new to Breeze and I'm not as good at Scala as at Python, so I propose to adopt the following strategy:
At the very beginning I will work only on writing a draft of the breeze-viz documentation. Writing documentation will force me to read the code accurately and to understand how everything works, as well as to predict risks and dangers. This shouldn't take too long.
Then, I will start to make small improvements in breeze-viz. After this startup I will refactor code (if necessary), make bigger changes and introduce essential features like.
I'm aware this is pretty vague; I will post precise proposals for improvements at the weekend.
--
You received this message because you are subscribed to the Google Groups "Scala Breeze" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-breeze...@googlegroups.com.
To post to this group, send email to scala-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/scala-breeze/eeeb5872-8396-4987-ad7d-ecb4d4ecb8c4%40googlegroups.com.
So anyway, here are my visualization thoughts, mainly from the bio-scientific user POV. The idea "reinvents" the wheel, as you put it, and would take longer to get off the ground. However, that tradeoff (I think) would allow for much easier expansion in the future as the package matures (see Goal 3 below), and allow us to go far beyond matplotlib/MatLab.
Goal 1 (for me)... is to make clean, publishable graphs (i.e. ideally PDF/PS output), in an interactive way.
That is where MatLab falls on its face. The default output is 80's-looking and customizability is poor, so most people I know resort to touching up vector output with Illustrator, etc., before submitting figures.
matplotlib looks much more modern, although I haven't published with it myself.
Goal 2... to make layered/tiled graphics with multiple elements
For real life use as a scientist, simple line graphs and bar charts are often not enough. See [Wilkinson](http://www.cs.uic.edu/~wilkinson/TheGrammarOfGraphics/GOG.html), [ggplot examples](https://www.google.de/search?q=ggplot2+examples), [Mathematica examples](http://reference.wolfram.com/mathematica/guide/GraphicsOptionsAndStyling.html).
In terms of layering and composing graphs, it is important to choose the coordinate system very carefully, and be in control of this. The MatLab axis concept causes severe headaches after the 2nd or 3rd graph element, with a lot of hand coding of layouts. It does seem that the matplotlib people have improved the situation, albeit slightly (http://matplotlib.org/gallery).
Goal 3... to make custom graphics and plots
This is where (if you share this goal), it may be particularly unwise to stick to JFreeChart as a backend (i.e. breeze.viz). (Furthermore, JFreeChart is Swing-based, but I think Java is moving towards JavaFX.)
In order to make custom plots, I think one needs a more systematic representation of primitives, [e.g.](http://reference.wolfram.com/mathematica/howto/CombineTwoOrMoreGraphics.html)
If I were forced to do this myself today (gun to my head) without further discussion, I would turn to ScalaFX/JavaFX to create our own graphics primitive/plot hierarchy from the ground up, modelled around [Mathematica graphics/plots](https://reference.wolfram.com/mathematica/tutorial/TheStructureOfGraphics.html), but with a more OOP and 21st century flair. A big advantage of this approach is that it can be expanded a lot in the future: 3D plots and animations (JavaFX), which are becoming pretty bread and butter in my field, plotting functions (not data, e.g. plot( sin(_) ); I think breeze.viz already does this), and [dynamically interactive plots](https://reference.wolfram.com/mathematica/guide/InteractiveManipulation.html), which I find very useful to quickly scan through large datasets.
Goal 4... To have sane default values, but to be able to specify every option of the graph in detail.
Matlab (and matplotlib) make extensive use of "nargin"-type parameters, many of which are text string values. Since Scala is a typed language, our only alternatives for following this syntax are to pass text strings exclusively and parse them at runtime, or to limit parameters to a single type, which is very often unreasonable.
Based on discussions in the past in the breeze group (https://groups.google.com/forum/#!topic/scala-breeze/o7A49ZYP1kg, https://github.com/scalanlp/breeze/pull/115, https://groups.google.com/forum/#!topic/scala-breeze/IcZxSOq6Fr8), I have chosen [case class/case object based options](https://github.com/scalanlp/breeze/blob/master/src/main/scala/breeze/signal/options.scala) for a similar problem in the breeze.signal package. This has the benefit that you can pass options that encapsulate different value types, for example, OptWindow.Automatic, OptWindow.Hamming(a, b), .... The objects are also compiled, so no parsing at runtime.
I have a feeling this would also work well for graphics options (OptColor.Black, OptColor.Hue(h, s, b, alpha), OptColor.RGB(r, g, b), OptColor.ColorMap( ColorMapHeat ), OptColor.Automatic, ....). You could also pass object options after display, to actively modify existing graphics.
Goal 5... slightly unrelated, but to have an iPython notebook/Mathematica notebook type REPL interface, which records commands, output and graphics output. See (https://github.com/Bridgewater/scala-notebook)
    # the histogram of the data
    n, bins, patches = matplotlib.pyplot.hist(x, num_bins, normed=1, facecolor='green', alpha=0.5)
    # add a 'best fit' line
    y = matplotlib.mlab.normpdf(bins, mu, sigma)
    plt.plot(bins, y, 'r--')
    plt.xlabel('Smarts')
    plt.ylabel('Probability')
    plt.title(r'Histogram of IQ: $\mu=100$, $\sigma=15$')
--
You received this message because you are subscribed to a topic in the Google Groups "Scala Breeze" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/scala-breeze/GcrhfKJHUEw/unsubscribe.
To unsubscribe from this group and all its topics, send an email to scala-breeze...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/scala-breeze/CALW2ey3hCBk2JOB90vpYLs1qO3Xm1kRTtUYkhEeF_jfoAm4XcA%40mail.gmail.com.
On Friday, February 28, 2014 10:01:34 AM UTC+1, David Hall wrote:
To be honest, I'm completely new to breeze and I don't know which ideas/projects are worth much and which don't introduce much value, as well as which projects are easy and which are hard. So in this issue I have to rely on you.
That's fair. I want to find something that interests you, to be sure. Things that I would like to have happen in the near and/or long term, in no particular order:
0) A good interactive-ish visualization library (that is, can pop up a window, not just generate graphics)
1) GPUs (I'm starting to work on this already)
2) NumPy parity: Besides ndarrays, this is basically just fleshing out a few functions.
3) Something like Pandas (/ annexing Saddle)
4) pretty much anything in SciPy
5) Integrating algebraic hierarchy from Spire or Algebird
6) Symbolic math
I filtered out projects no 1, 2, 3 and 5, because they are hard or require really good knowledge of Scala and its type system or require good knowledge of some libraries (e.g. Saddle) which I lack.
So I fished out projects (0) breeze-viz, (4) anything from SciPy and (6) symbolic math. I find all three projects suitable for me, because they require superb comprehension of neither Breeze internals nor Scala's type system, and because these projects are about building a new layer on top of other layers, which is much easier than tampering with existing layers.
At the beginning I will focus on writing documentation for existing modules without enhancing them. This could be quite beneficial since everybody wants documentation. This is also a chance for me to get into Breeze internals as well as to better understand Scala and the proper use of its features.
After this setup, I will implement the new modules. There are many modules inside SciPy. Which features should attract more attention?
1) clustering
2) integration
3) interpolation
4) signal processing: B-splines, filtering and so on
5) graph routines and data structures
6) enhancing optimization
7) enhancing statistical functions
IMO (2) integration and (3) interpolation are the most important, but I would like to know your opinion.
We will start with interpolation (if this is a desired feature) since it's the easiest topic for me (last term I took a course that covered interpolation among other numerical algorithms).
I will implement linear and spline interpolators as well as design an interface for all univariate interpolators. Tests and documentation will also be written. BTW, I will use a test-driven or documentation-driven methodology, so I will start by writing tests or documentation. Then, I will publish the tests/docs so you can form an opinion and give me feedback about the API before implementation.
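A minimal sketch of what a shared univariate-interpolator interface and a linear implementation could look like. It is written in Python (the language of the matplotlib snippet quoted earlier in the thread) purely for illustration; the `UnivariateInterpolator` protocol and the constructor signature are assumptions, not the Breeze API and not the draft referenced in the thread.

```python
from bisect import bisect_right
from typing import Protocol, Sequence


class UnivariateInterpolator(Protocol):
    """Hypothetical common interface: a callable from x to an interpolated y."""

    def __call__(self, x: float) -> float: ...


class LinearInterpolator:
    """Piecewise-linear interpolation through the nodes (xs[i], ys[i])."""

    def __init__(self, xs: Sequence[float], ys: Sequence[float]) -> None:
        if len(xs) != len(ys) or len(xs) < 2:
            raise ValueError("need at least two nodes and equally long xs, ys")
        if any(a >= b for a, b in zip(xs, xs[1:])):
            raise ValueError("xs must be strictly increasing")
        self.xs, self.ys = list(xs), list(ys)

    def __call__(self, x: float) -> float:
        if not self.xs[0] <= x <= self.xs[-1]:
            raise ValueError("x lies outside the interpolation range")
        # Index of the segment [xs[i], xs[i+1]] containing x, clamped so that
        # x == xs[-1] falls into the last segment.
        i = min(bisect_right(self.xs, x) - 1, len(self.xs) - 2)
        t = (x - self.xs[i]) / (self.xs[i + 1] - self.xs[i])
        return self.ys[i] + t * (self.ys[i + 1] - self.ys[i])
```

Writing the tests first, as proposed, would pin down choices such as whether evaluating outside the node range raises (as assumed here) or extrapolates.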
After that, integration will get attention. Again, I will implement only one method of integration (let it be integration of a univariate function using the trapezoid method) and provide rich documentation.
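For reference, the composite trapezoid rule mentioned above can be sketched in a few lines. This is an illustrative Python version with a hypothetical function name and signature, assuming equally spaced subintervals; it is not an existing Breeze function.

```python
from typing import Callable


def trapezoid(f: Callable[[float], float], a: float, b: float, n: int) -> float:
    """Approximate the integral of f over [a, b] with n equal subintervals."""
    if n < 1:
        raise ValueError("n must be at least 1")
    h = (b - a) / n
    # Endpoints carry weight 1/2, interior nodes weight 1.
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total
```

The rule is exact for linear integrands, and for smooth ones its error shrinks quadratically as the step size decreases, which gives two easy properties to test against.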
Before the coding period starts, I could implement one small module (i.e. linear univariate interpolation) as proof that I know Scala well enough to manage this project and to get started with the tools I will use during GSoC (I mean sbt and markdown).
Hi Chris and David,
I'm not formally involved with this GSoC project, but I've also left a quick comment / question on the first draft of linear interpolation.
To summarise my question for the benefit of the mailing list: what typeclass should interpolation operate on?
In the past, I have required linear interpolation for both vector / tensor spaces and for scalars. Mathematically, interpolation requires the same operations on both types: multiplication by a scalar and addition. However, the operation for scalar multiplication is encoded differently for field / scalar types than it is for vector / tensor spaces. It's my understanding (please correct me if I'm wrong here) that for SemiRings, scalar multiplication is typically done by promoting a scalar to the SemiRing and then performing a SemiRing multiplication operation, whereas for vector / tensor spaces, the scalar multiplication is an explicit operation (mulVS, or OpMulScalar). Is there a simple way around these differences, unifying the types to allow the operations to be done in the same way?
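The mathematical point, that interpolation needs only scalar multiplication and addition, can be illustrated with a small Python sketch. Python's duck typing sidesteps the typeclass-encoding question raised here, so this shows only that one formula serves both scalars and vectors; `Vec2` and `lerp` are hypothetical names, unrelated to Breeze's types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec2:
    """Toy 2-D vector (hypothetical); supports only what lerp needs."""

    x: float
    y: float

    def __add__(self, other: "Vec2") -> "Vec2":
        return Vec2(self.x + other.x, self.y + other.y)

    def __rmul__(self, k: float) -> "Vec2":
        # Handles scalar * vector.
        return Vec2(k * self.x, k * self.y)


def lerp(a, b, t: float):
    """Linear interpolation written once: scalar multiplication and addition."""
    return (1 - t) * a + t * b
```

Here `lerp(2.0, 4.0, 0.25)` and `lerp(Vec2(0, 0), Vec2(4, 8), 0.5)` go through the same code path; in a statically typed setting the open question is which abstraction expresses that shared requirement.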
I've also encountered a similar problem with FIR and IIR filtering of signals: it should be possible to write the filter identically for scalar and vector space types, but there seems to be a mismatch due to the way that scalar multiplication is represented.
Please let me know if I'm missing something obvious. This might not actually be an issue; it could be that I am currently mis-using the typeclasses somehow. :-)
Thanks,
Jonathan Merritt.
OK, I've commented on the proposal. I've also written a first draft of linear interpolation [1]. Please have a look at it.
Do I need to post my proposal to scala-language too? Who decides which students will be accepted? And how can I improve my chances of being accepted?
BTW I think that it'd be better to include the documentation in the repo. Unfortunately, it's not easy to integrate with GitHub wiki pages -- each wiki page is a separate repo. However, documentation could be hosted on http://www.scalanlp.org/. Of course, this is only an idea, we don't have to do it. What do you think about that?
Sorry for the delay. Been super busy.
On Sun, Mar 9, 2014 at 9:11 AM, Christopher Medrela <chris....@gmail.com> wrote:
OK, I've commented on the proposal. I've also written a first draft of linear interpolation [1]. Please have a look at it.
It looks pretty good! We'll tweak it, but it's a good start.
Do I need to post my proposal to scala-language too? Who decides which students will be accepted? And how can I improve my chances of being accepted?
Yeah, I think they want that.
BTW I think that it'd be better to include the documentation in the repo. Unfortunately, it's not easy to integrate with GitHub wiki pages -- each wiki page is a separate repo. However, documentation could be hosted on http://www.scalanlp.org/. Of course, this is only an idea, we don't have to do it. What do you think about that?
Each wiki page is a single file, yes? g...@github.com:scalanlp/breeze.wiki.git (We can do a git submodule if you want.) I think that's the best place for it.
On Thursday, March 13, 2014 7:01:01 PM UTC+1, David Hall wrote:
It looks pretty good! We'll tweak it, but it's a good start.
I improved the code. Is there any way to avoid repeating all arguments in the LinearInterpolator constructor?
Each wiki page is a single file, yes? g...@github.com:scalanlp/breeze.wiki.git (We can do a git submodule if you want.) I think that's the best place for it.
The problem with docs in submodules is that documentation commits are not associated with code commits. If we had the documentation in the same repository, there would be no question like "which breeze version does this documentation describe?". Today I discovered GitHub Pages. It supports Jekyll, so it can generate fancy-looking pages from Markdown docs! Maybe we could give it a try?
On Fri, Mar 14, 2014 at 8:21 AM, Christopher Medrela <chris....@gmail.com> wrote:
The problem with docs in submodules is that documentation commits are not associated with code commits.
Aren't they? Isn't a git submodule stored as a particular commit to that repo? So "tag releases/v0.6's doc submodule is to commit 1a2b3c4d5e"
I have a fantasy that involves writing an inverse doctest where code snippets in markdown docs are treated as tests. I don't think it would take that long. Maybe I should just do it.
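A rough sketch of that "inverse doctest" idea, under stated assumptions: it is written in Python rather than Scala, it treats only fenced blocks tagged `python` as tests, and it runs all snippets of a document in one shared namespace. The function names are hypothetical; a version for Breeze would need to compile and run Scala snippets instead.

```python
import re
from typing import List

TICKS = "`" * 3  # fence marker, built here to avoid a literal fence in this listing
BLOCK = re.compile(
    rf"^{TICKS}python[ \t]*\n(.*?)^{TICKS}[ \t]*$", re.DOTALL | re.MULTILINE
)


def extract_snippets(markdown: str) -> List[str]:
    """Return the bodies of all fenced blocks tagged 'python'."""
    return BLOCK.findall(markdown)


def run_snippets(markdown: str) -> int:
    """Execute every snippet; a failing assert or exception fails the document.

    Snippets share one namespace, so later blocks may use earlier definitions.
    Returns the number of snippets executed.
    """
    namespace: dict = {}
    snippets = extract_snippets(markdown)
    for index, source in enumerate(snippets):
        code = compile(source, f"<snippet {index}>", "exec")
        exec(code, namespace)
    return len(snippets)
```

Running this over the docs in CI would make a stale example show up as a failing build, which also addresses the "which version does this documentation describe?" concern raised above.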