This may not be a popular opinion, but I feel any honest conversation
about making the source accessible for would-be contributors needs to
discuss the trend of increasing complexity in the code base. The
unpleasant reality is Arduino's internal code complexity has massively
increased over the last few years, to the point where it's nearly
impossible for new contributors to get started with the IDE. If
rearranging folders and files helps at all, it'll make only a slight
improvement.
I've followed the IDE development closely for 7 years. During that
time, I've noticed 4 eras of complexity.
1: Long ago, nearly all data was managed by the Processing preferences,
which look up values by string key. Keys were made by string
concatenation, with different systems prefixing their names. This was
very easy for would-be contributors to understand, especially because
the strings matched up to the contents of preferences.txt. The code was
usually compact and self-contained, which is easy to understand when
getting started. Arduino didn't support more than 1 boards.txt, but
multiple boards and cores were supported.
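To make that concrete, here's a rough sketch of what I mean by the era 1 style. This is not the actual Processing or Arduino code. The class name, the helper functions and the "boards." key prefix are all made up for illustration. The point is only that one flat store plus string concatenation is easy to follow, and every key can be found by searching a text file.

```java
import java.util.Properties;

// Hypothetical sketch of the "era 1" style: one flat key/value store,
// with keys built by string concatenation. Names are illustrative only.
class FlatPrefsSketch {
    static final Properties prefs = new Properties();

    // Single lookup function used by everything.
    static String get(String key) {
        return prefs.getProperty(key);
    }

    // A subsystem prefixes its own name to build the key.
    // The "boards." prefix is an assumption for this example.
    static String boardSetting(String board, String setting) {
        return get("boards." + board + "." + setting);
    }

    public static void main(String[] args) {
        prefs.setProperty("boards.uno.upload.speed", "115200");
        System.out.println(boardSetting("uno", "upload.speed")); // prints 115200
    }
}
```

A newcomer can grep for "upload.speed" and find both the data and the code that reads it.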
2: When multiple hardware directories and separate boards.txt files were
supported, the code base started transitioning to Java maps. The
concept of a single hierarchy was replaced by maps for targets, which
mapped to boards, and then to cores, and eventually to variants.
Simple, easy to understand data types like "String" were replaced with
types like "Map<String, Integer>" or "Map<String, Target>" or even
"Map<String, Map<String, String>>", which makes the code quite difficult
to read. As these Map types became pervasively used as types for
function inputs and return values, and lots of different and separate
Map structures were used for many different purposes, Arduino lost the
simplicity of a text-based file which corresponded to recognizable stuff
in the code base. It became necessary to keep a mental model of what
all these various maps did, and be able to deal with the much more
complex Java syntax these maps require.
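Here's a small sketch of the same kind of lookup in the era 2 style, just to show how much more there is to read. Again, this is not copied from the Arduino source. The class, the map layout (board id to settings) and the function name are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the "era 2" style: nested maps used as
// function inputs and return values. Layout is illustrative only.
class NestedMapSketch {
    // Build a map of board id -> (setting name -> value).
    static Map<String, Map<String, String>> loadBoards() {
        Map<String, Map<String, String>> boards = new HashMap<>();
        Map<String, String> uno = new HashMap<>();
        uno.put("upload.speed", "115200");
        boards.put("uno", uno);
        return boards;
    }

    // The reader must know what each level of the map means
    // before this function makes any sense.
    static String lookup(Map<String, Map<String, String>> boards,
                         String board, String setting) {
        Map<String, String> settings = boards.get(board);
        return settings == null ? null : settings.get(setting);
    }

    public static void main(String[] args) {
        System.out.println(lookup(loadBoards(), "uno", "upload.speed")); // prints 115200
    }
}
```

The result is the same as a flat string lookup, but nothing in the type "Map<String, Map<String, String>>" tells you what the outer and inner keys are. Only a comment or a mental model does.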
3: In the 1.5 development, as more and more maps were added, the third
era of Java class abstractions began. Nearly all these map types were
converted to Java classes. Some add little more than giving the map a
name, but many do add code. Many aren't maps, but encapsulate a small
amount of data. A tremendous number of these exist now.
LibrariesIndex, LibrariesIndexer, ContributedLibrary,
LibraryWithNamePredicate, LibraryInstalledInsideCore,
ContributionsIndex, ContributionsIndexer, ContributedTargetPackage,
ContributedPlatform, PlatformArchitecturePredicate,
HostDependentDownloadableContribution are just some you might try
reading to figure out how Arduino 1.6.x does something with managing
libraries. Almost none of them have comments explaining their purpose
or rationale. For example, you might look at ContributedTargetPlatform,
only to discover it has almost no code (and no comments explaining what
its purpose is), but it inherits from LegacyTargetPlatform. Then you
look at LegacyTargetPlatform, which does have lots of code (but again no
explanation of what it is or does). It does have a couple Map lists,
and it inherits from TargetPlatform, which appears to be an abstract
class. Most of the functions return Map types. Every one of these
classes was probably created with some good reason, but there are so
many of them, and in so many cases they don't encapsulate functionality
like you'd expect an object oriented program to do, but rather provide
references to Map types, which makes the code very complex and difficult
to understand.
4: Starting with 1.6.6, we now seem to be moving into a new era where a
good portion of Arduino's functionality has moved to a separate code
base, written in Google's Go language. Now a contributor needs to
understand not only Java and the finer points of complex Map types in
Java, but Go as well. The new builder code continues the tradition
of a large number of classes to understand (many with only small amounts
of code), with little or no documentation about their purpose or
rationale. Even the program's structure takes these abstraction levels
to new heights, with separate objects to collect up variables, which are
passed from object to object, making it quite difficult to follow what
data is actually input and output by each part of the program. Even
running functions in sequence is done through an abstraction layer of
lists of function references, rather than normal code that simply calls
the functions. Having two difficult code bases to learn in 2 languages
really raises the bar for anyone to learn the code well enough to do
something really useful.
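To illustrate the difference I'm describing, here's a sketch of the two styles side by side. It's written in Java rather than Go, and it is not the builder's actual code. The step names, the context keys and the fake "compile" output are all invented for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical comparison of two ways to run build steps in order.
class PipelineSketch {
    // Style A: steps live in a list of function references and share
    // a loosely typed context. Inputs and outputs of each step are
    // only visible by reading which keys it happens to touch.
    static String runIndirect(String sketch) {
        Map<String, Object> ctx = new HashMap<>();
        ctx.put("source", sketch);

        List<Consumer<Map<String, Object>>> steps = new ArrayList<>();
        steps.add(c -> c.put("preprocessed", "#include <Arduino.h>\n" + c.get("source")));
        steps.add(c -> c.put("object", "obj(" + c.get("preprocessed") + ")"));

        for (Consumer<Map<String, Object>> step : steps) {
            step.accept(ctx);
        }
        return (String) ctx.get("object");
    }

    // Style B: ordinary functions with explicit inputs and outputs,
    // called directly in the order they run.
    static String preprocess(String source) {
        return "#include <Arduino.h>\n" + source;
    }

    static String compile(String preprocessed) {
        return "obj(" + preprocessed + ")";
    }

    static String runDirect(String sketch) {
        return compile(preprocess(sketch));
    }

    public static void main(String[] args) {
        String sketch = "void setup() {}";
        System.out.println(runIndirect(sketch).equals(runDirect(sketch))); // prints true
    }
}
```

Both produce the same result. In style B the data flow is in the function signatures. In style A you have to trace every step to learn which context keys exist at any given point.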
My long-winded point is Arduino's code base has grown very, very
complex. Admittedly, the IDE has become very configurable. But still,
the trend towards more and more classes and abstraction layers, the
passing of complex data sets around, the use of complex types and
syntax, and now 2 languages... it all makes the code base astronomically more
difficult for would-be contributors to come up to speed and do anything
productive.
Rearranging files and folders might help a tiny bit, but doing so is
really ignoring the reality that the code itself has become very complex
and unapproachable, regardless of how the files are arranged.
If you really want to make the code base more approachable, work needs
to be done to document the purpose and rationale behind the many
classes. Future development should consider the cost in complexity of
adding more abstraction layers. Ideally, more object oriented design
principles could be adopted, to encapsulate data and functionality
within objects, rather than provide public functions which effectively
make all the object's private data public by providing Map references.
Effort should be made to avoid very complex Java & Go syntax, using
simple types for function inputs and returns. Above all else, there
really needs to be documentation created to explain what these many new
classes do, and why they do it, and how each fits into the overall
working of Arduino.
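As a rough illustration of the kind of encapsulation I'm suggesting, compare these two hypothetical classes. Neither exists in the Arduino source. The class names, the "upload.speed" key and the method names are assumptions made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: exposing a Map reference versus encapsulating it.
class EncapsulationSketch {
    // Leaky style: the getter hands out the private map itself, so
    // every caller can read and modify the object's internal data.
    static class LeakyBoard {
        private final Map<String, String> prefs = new HashMap<>();

        Map<String, String> getPreferences() {
            return prefs;
        }
    }

    /**
     * Encapsulated style: the map stays private, and callers get a
     * documented function with simple input and return types.
     */
    static class EncapsulatedBoard {
        private final Map<String, String> prefs = new HashMap<>();

        EncapsulatedBoard(Map<String, String> initial) {
            prefs.putAll(initial); // copy, so later changes by the caller don't leak in
        }

        /** Upload speed in baud, or defaultSpeed if the board defines none. */
        int uploadSpeed(int defaultSpeed) {
            String value = prefs.get("upload.speed");
            return value == null ? defaultSpeed : Integer.parseInt(value);
        }
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put("upload.speed", "115200");
        EncapsulatedBoard board = new EncapsulatedBoard(settings);
        System.out.println(board.uploadSpeed(9600)); // prints 115200
    }
}
```

With the second class, a new contributor only needs to read one commented function to know what the object offers, without learning what keys the map might contain.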