YARD 0.8.0 Development Plans

Loren Segal

Jan 12, 2012, 2:57:08 AM

to yar...@googlegroups.com

Hi all,

It's 2012, and work for YARD 0.8.0 is underway. There are some major
changes coming for the next major release. Among them are:

* Note: all ticket numbers can be referenced off of
https://github.com/lsegal/yard/issues
<https://github.com/lsegal/yard/issues?sort=created&direction=desc&state=open&page=1&milestone=15>

- Internationalization, see #395 as a starter
- Directive tag syntax (for behavioural tags), see #453
- A rewritten C parser and integration into the handler arch, see #454
- --embed-mixins to embed mixin methods into a class documentation page,
see #450
- Improved macro support (multiple tickets, but mostly dependent on
directive syntax in #453)

There are also various small additions and fixes to be made. The
[incomplete] list of things that need to get done before release is on
GitHub's issues page:

https://github.com/lsegal/yard/issues?sort=created&direction=desc&state=open&page=1&milestone=15

Some things are missing because they were forgotten or not added to the
issues page when reported. Some items will be pushed to a future
release. Some items have mostly been implemented (the C parser, for one,
embed-mixins is almost done) but have not yet been finalized. I'll be
curating the list over the coming days. I want to lock down the feature
additions before the end of this month, so if you have suggestions for
0.8.0, please create a new issue ASAP.

By far the largest feature that needs to be implemented right now is the
new directive syntax. https://github.com/lsegal/yard/issues/453 This
feature will require fairly extensive changes to a number of internal
components and has some quite complex design considerations. It is hard
to split this work up into multiple subcomponents, as they all depend on
one another, and, because of compatibility issues, the API still needs
to be finalized. Once this feature is complete, we will be
much closer to knowing where we stand for release. The goal is to have
an implementation plan for this feature in particular before the end of
the month.

Some other things to mention:

1. The target release month for 0.8.0 will be March, hopefully
mid-month. But this is a loose schedule and may change (possibly
earlier, possibly later) depending on how things go. Note that a 0.7.5
release might be coming either at the end of Jan. or early Feb.,
depending on some pending bug reports (it might not come at all if 0.8.0
is nearly complete). If this release occurs, it will only be minor
bug/stability fixes, not new features (and potentially some bug fix
backports from the 0.8.0-master branch).

2. All development is currently occurring in the 0.8.0-master branch.
This branch will soon be merged back into master, possibly within the
next week or so.

3. Patches are welcome, and if anybody can knock any of the issues off
the list, it would be hugely appreciated!

Thanks all for making YARD awesome.

Loren

Kouhei Sutou

Jan 16, 2012, 7:55:27 AM

to yar...@googlegroups.com

Hi,

In <4F0E9254...@soen.ca>
"[YARD] YARD 0.8.0 Development Plans" on Thu, 12 Jan 2012 02:57:08 -0500,
Loren Segal <lse...@soen.ca> wrote:

> It's 2012, and work for YARD 0.8.0 is underway. There are
> some major changes coming for the next major release. Among
> them are:

...

> - Internationalization, see #395 as a starter

Thanks for including it in major changes! :-)
I've almost finished it in my i18n branch:
https://github.com/kou/yard/tree/i18n

But it seems that I should rework it after 0.8.0-master
is merged into master, because my branch changes some core
files and template files. So it seems that I should first work on
things that help you merge 0.8.0-master into master. ;-)

> There are also various small additions and fixes to be
> made. The [incomplete] list of things that need to get done
> before release is on Github's issues page:
>
> https://github.com/lsegal/yard/issues?sort=created&direction=desc&state=open&page=1&milestone=15

Should I check the list and send patches? Or should I wait
for your curation?

> 1. The target release month for 0.8.0 will be March,
> hopefully mid-month. But this is a loose schedule and may
> change (possibly earlier, possibly later) depending how
> things go.

Great!

Thanks,
--
kou

Loren Segal

Jan 16, 2012, 3:34:20 PM

to yar...@googlegroups.com

On 1/16/2012 7:55 AM, Kouhei Sutou wrote:
> Thanks for including it in major changes! :-)
> I've almost done it in my i18n branch:
> https://github.com/kou/yard/tree/i18n

I will take a look!

>
> But it seems that I should rework on it after 0.8.0-master
> is merged into master because my branch changes some core
> files and template files. So it seems that I should work on
> some works that help you merge 0.8.0-master into master. ;-)

I've just merged 0.8.0-master into master. I moved the old master to
0.7-stable, and I will be backporting all the 0.8 issues marked "bug" to
that branch for a 0.7.5 release.

>
>
>> There are also various small additions and fixes to be
>> made. The [incomplete] list of things that need to get done
>> before release is on Github's issues page:
>>
>> https://github.com/lsegal/yard/issues?sort=created&direction=desc&state=open&page=1&milestone=15
> Should I check the list and send patches? Or should I wait
> for your curation?

I did some curation already, so you can check the list. Anything without
a "NeedsFeedback" tag in the issues list can be addressed, especially
those marked "bug".

Loren

Kouhei Sutou

Jan 17, 2012, 10:37:30 AM

to yar...@googlegroups.com

Hi,

In <4F1489CC...@soen.ca>
"Re: [YARD] YARD 0.8.0 Development Plans" on Mon, 16 Jan 2012 15:34:20 -0500,
Loren Segal <lse...@soen.ca> wrote:

> On 1/16/2012 7:55 AM, Kouhei Sutou wrote:
>> Thanks for including it in major changes! :-)
>> I've almost done it in my i18n branch:
>> https://github.com/kou/yard/tree/i18n
> I will take a look!

Thanks!

I added "Internationalization Support" to
https://github.com/kou/yard/blob/i18n/docs/GettingStarted.md
at the bottom.

(There are some TODO marks.)

> I've just merged 0.8.0-master into master. I moved the old
> master to 0.7-stable, and I will be backporting all the 0.8
> issues marked "bug" to that branch for a 0.7.5 release.

Great!

>>> https://github.com/lsegal/yard/issues?sort=created&direction=desc&state=open&page=1&milestone=15
>> Should I check the list and send patches? Or should I wait
>> for your curation?
>
> I did some curation already, so you can check the
> list. Anything without a "NeedsFeedback" tag in the issues
> list can be addressed, especially those marked "bug".

OK. I'll check the list.

Thanks,
--
kou

Kouhei Sutou

Jan 21, 2012, 5:09:02 AM

to yar...@googlegroups.com

Hi,

In <20120118.003730.35...@cozmixng.org>
"Re: [YARD] YARD 0.8.0 Development Plans" on Wed, 18 Jan 2012 00:37:30 +0900 (JST),
Kouhei Sutou <k...@cozmixng.org> wrote:

>>>> https://github.com/lsegal/yard/issues?sort=created&direction=desc&state=open&page=1&milestone=15
>>> Should I check the list and send patches? Or should I wait
>>> for your curation?
>>
>> I did some curation already, so you can check the
>> list. Anything without a "NeedsFeedback" tag in the issues
>> list can be addressed, especially those marked "bug".
>
> OK. I'll check the list.

I checked the list and sent some pull requests for the issues that I
can reproduce in my environment. The other issues can't be
reproduced in my environment. It seems that they have already
been fixed.

Thanks,
--
kou

Loren Segal

Jan 21, 2012, 10:31:59 AM

to yar...@googlegroups.com

Hey Kou,

I saw the boatload of pulls today, thanks! I'm at a conference right now, but I'll be looking them over tomorrow or Monday. It looks awesome though, so again, thank you very much!

Loren

Kouhei Sutou

Jan 31, 2012, 7:33:10 AM

to yar...@googlegroups.com

Hi,

In <1AE8BF16-35E0-41D4...@soen.ca>
"Re: [YARD] YARD 0.8.0 Development Plans" on Sat, 21 Jan 2012 10:31:59 -0500,
Loren Segal <lse...@soen.ca> wrote:

>>>>>> https://github.com/lsegal/yard/issues?sort=created&direction=desc&state=open&page=1&milestone=15
>>>>> Should I check the list and send patches? Or should I wait
>>>>> for your curation?
>>>>
>>>> I did some curation already, so you can check the
>>>> list. Anything without a "NeedsFeedback" tag in the issues
>>>> list can be addressed, especially those marked "bug".
>>>
>>> OK. I'll check the list.
>>
>> I checked the list and sent some pull requests for the issues that I
>> can reproduce in my environment. The other issues can't be
>> reproduced in my environment. It seems that they have already
>> been fixed.
>
> I saw the boatload of pulls today, thanks! I'm at a conference right now, but I'll be looking them over tomorrow or Monday. It looks awesome though, so again, thank you very much!

Thanks for merging my pulls.

How should I work on the i18n feature? It is not a small
change, and it may be difficult to merge the work into
master. I want to follow the workflow that you prefer.

Thanks,
--
kou

Loren Segal

Jan 31, 2012, 4:31:55 PM

to yar...@googlegroups.com

On 1/31/2012 7:33 AM, Kouhei Sutou wrote:
>
> Thanks for merging my pulls.

Np! Thanks for fixing those bugs!

>
> How should I work on i18n feature? It is not a small
> change. It may be difficult to merge the work into the
> master. I want to follow the work flow which you like.

I think the best method is what you are already doing for your other
patches. Just be sure to `git rebase master` prior to pushing-- that way
you should be able to stay more or less in sync.

Loren

Kouhei Sutou

Feb 2, 2012, 8:10:07 AM

to yar...@googlegroups.com

Hi,

In <4F285DCB...@soen.ca>
"Re: [YARD] YARD 0.8.0 Development Plans" on Tue, 31 Jan 2012 16:31:55 -0500,
Loren Segal <lse...@soen.ca> wrote:

>> How should I work on i18n feature? It is not a small
>> change. It may be difficult to merge the work into the
>> master. I want to follow the work flow which you like.
>
> I think the best method is what you are already doing for
> your other patches. Just be sure to `git rebase master`
> prior to pushing-- that way you should be able to stay more
> or less in sync.

OK.

Can I send several pull requests for the i18n feature instead of
one pull request? For example, "pot output" would be one of the pull
requests.

It seems that the i18n feature change is too big to review at once.
If you accept my proposal, I'll create some branches and
send pull requests.

Thanks,
--
kou

Loren Segal

Feb 4, 2012, 6:25:44 PM

to yar...@googlegroups.com

Hi,

On 2/2/2012 8:10 AM, Kouhei Sutou wrote:
> Can I send some pull requests for i18n feature instead of
> one pull request? For example, "pot output" is one of pull
> requests.

Yes. The more you can split it up the better.

Can you explain what your changes will include, so I know what to
expect? For instance, if you will be replacing hardcoded strings with
something like tr("Hardcoded string"), it would be good to know about
this before the pulls-- that way I can start applying those changes in
any new code I'm writing.

Also, if you are in fact making these underlying changes (like a tr() or
similarly named translation method), can you make those pull requests
*first*? I would merge those in right away, that way we can have the
base infrastructure in master so I can start moving things over as well.

A quick explanation of your design / implementation prior to the pulls
will help us coordinate this better, I think.

Loren

Kouhei Sutou

Feb 5, 2012, 5:30:35 AM

to yar...@googlegroups.com

Hi,

In <4F2DBE78...@soen.ca>
"Re: [YARD] YARD 0.8.0 Development Plans" on Sat, 04 Feb 2012 18:25:44 -0500,
Loren Segal <lse...@soen.ca> wrote:

> Can you explain what your changes will include, so I know
> what to expect? For instance, if you will be replacing
> hardcoded strings with something like tr("Hardcoded
> string"), it would be good to know about this before the
> pulls-- that way I can start applying those changes in any
> new code I'm writing.

OK. It's reasonable.

Here is a breakdown of the i18n feature:

For document (including YARD's document):
1. extract text to be translated from docstring.
2. generate .pot file for extracted text.
3. apply translation to docstring.
4. provide a Rake task that extracts text and merges
extracted text, to ease maintenance.
5. document "how to create i18n supported document".

For YARD:
6. extract text to be translated from template.
7. extract text to be translated from lib/yard/**/*.rb.
8. generate .pot file for extracted text.
9. apply translation to template.

'tr("Hardcoded string")' corresponds to 9. In my work, the
translation method is named "_" because gettext uses
"_". For example, _("Hardcoded string").

Gettext also uses "N_", which is just a marker for the text
extractor. It simply returns the string passed to it. Here is an
implementation of N_ in Ruby:

    def N_(message)
      message
    end

The text extractor parses source and template files statically
and finds the text to be translated.
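
To make the difference between the two markers concrete, here is a minimal sketch. It assumes a plain Hash stands in for a loaded translation catalog; `I18N_CATALOG`, `TAG_LABELS`, and the "[ja] ..." strings are placeholders made up for illustration and are not part of YARD or the gettext gem.

```ruby
# Hypothetical stand-in for a loaded translation catalog (an assumption,
# not how YARD or the gettext gem stores translations).
I18N_CATALOG = { "Abstract" => "[ja] Abstract" }.freeze

# Marks a string for the text extractor and returns it unchanged.
def N_(message)
  message
end

# Looks the message up at call time, falling back to the original text.
def _(message)
  I18N_CATALOG.fetch(message, message)
end

# N_ is used where no locale is known yet, for example at load time ...
TAG_LABELS = { abstract: N_("Abstract") }.freeze

# ... and _() translates the stored label later, when it is displayed.
puts _(TAG_LABELS[:abstract])
```

The point of the split is that the extractor can find both kinds of call statically, while only `_()` performs a lookup at runtime.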

> Also, if you are in fact making these underlying changes
> (like a tr() or similarly named translation method), can you
> make those pull requests *first*? I would merge those in
> right away, that way we can have the base infrastructure in
> master so I can start moving things over as well.

I'll send a pull request for 9 first. It'll introduce "_"
and "N_" methods.

> A quick explanation of your design / implementation prior to
> the pulls will help us coordinate this better, I think.

Thanks for your advice!
If you need more explanation, I'll provide it. Otherwise, I'll
send a pull request.

Thanks,
--
kou

Loren Segal

Feb 5, 2012, 6:16:29 AM

to yar...@googlegroups.com

On 2/5/2012 5:30 AM, Kouhei Sutou wrote:
> Hi,
>
> In<4F2DBE78...@soen.ca>
> "Re: [YARD] YARD 0.8.0 Development Plans" on Sat, 04 Feb 2012 18:25:44 -0500,
> Loren Segal<lse...@soen.ca> wrote:
>
>> Can you explain what your changes will include, so I know
>> what to expect? For instance, if you will be replacing
>> hardcoded strings with something like tr("Hardcoded
>> string"), it would be good to know about this before the
>> pulls-- that way I can start applying those changes in any
>> new code I'm writing.
> OK. It's reasonable.
>
> Here are breaked down features for i18n feature:
>
> For document (including YARD's document):
> 1. extract text to be translated from docstring.
> 2. generate .pot file for extracted text.
> 3. apply translation to docstring.
> 4. provide a Rake task that extracts text and merges
> extracted text for easy to maintain.
> 5. document "how to create i18n supported document".

By document you mean documentation, yes? So from what I understand, the
goal is two-fold:

1. Create tools that help documentation authors create .pot files via a
Rake task
2. Apply these tools on YARD's own docs so that we can start
internationalizing YARD docs

If this is accurate,

1. Can you explain a little bit about the Rake task? Why not a YARD
command, for example? Like, `yard i18n [--extract|--merge] <files>`. We
could perhaps associate a Rake task that runs this CLI command like the
YardocTask, for convenience.

2. It's one thing to extract the strings into .pot files for YARD docs,
but it's a completely different thing to actually apply translations. We
could use the tools to extract the docs, but we probably won't see any
translations any time soon. So I'm not sure if this step will be of much
use just yet.

>
> For YARD:
> 6. extract text to be translated from template.
> 7. extract text to be translated from lib/yard/**/*.rb.
> 8. generate .pot file for extracted text.
> 9. apply translation to template.
>
> 'tr("Hardcoded string")' is corresponding to 9. In my work,
> translation method is named "_" because gettext uses
> "_". For example _("Hardcoded string").
>
> Gettext also uses "N_" that is just a mark for text
> extractor. It just returns passed string. Here is an
> implementation of N_ in Ruby:
>
>     def N_(message)
>       message
>     end

_("string") is fine, I figured it might be that.

Regarding 6 and 7-- is the plan to automate this? If so, are there any
tools to automate extraction? I guess I'm asking because-- once the
_() and N_() methods are in master, should I start using _() in any new
code, or will some automation tool be doing this at a later stage? FYI,
we tend to avoid the use of strings for internal stuff (we use symbols
to represent keys), so it should actually be fairly effective to
automate the search for strings in source and rewrite them. I don't know
if this was the actual plan though. Doing it manually might be pretty
intensive, though I suppose it sucks either way.

>> A quick explanation of your design / implementation prior to
>> the pulls will help us coordinate this better, I think.
> Thanks for your advice!
> If you need more explanation, I'll provide it. Otherwise, I'll
> send a pull request.

A couple of general design questions--

I would basically want to know, from a documentation writer's
perspective, how exactly will YARD be storing and using this data? For
example, we will have tools to extract the .pot files (`yard i18n` or a
rake task), and then the user will fill in those pot files with
translations. Right? So a few questions:

0. (Just to be sure) the .pot files are the "final" files right? Will
they need to run some tool against these .pot files to turn them into
something else? Or will YARD just do that step internally? I'm not *too*
familiar with gettext, but I vaguely know of .po and .mo-- that's about
all I know.

1. Where would these .pot files be stored (or whatever final files that
are created)? Would there be some conventional location to store them?
This is important for the last question...

2. How will this work from the runtime API perspective? Remember, YARD's
plugin/extensibility support is an important part of the project. For
instance, if someone loads up the Registry (YARD::Registry.load!), grabs
an object (o = Registry.at('YARD::CodeObjects::Base')) and asks for the
docstring (o.docstring), they'll get the docstring. Should we be
autotranslating the docstring into their language? Or should we make
them use _(o.docstring) to translate on their own? Perhaps we should
have a o.docstring.translated to make this more obvious to users trying
to write plugins, if we don't auto-translate.

3. Regarding the runtime API again-- a user already typically loads the
registry via Registry.load. But, if .pot files are external to the
registry, they will ALSO have to load up the translations in another
command for something like o.docstring.translated to work. Is this
handled by gettext as well? Could we theoretically compile the
translations and drop them into the `.yardoc` db directory? That way
they would be associated with the registry and could be loaded when the
Registry is loaded in a single command. If the translations are outside
the .yardoc directory, it would be more difficult to load them together
(you would have to specify both paths). Note that if my assumption about
question 0 is wrong, and .pot files are not the final files, it would
actually make sense to "compile" these translations into the .yardoc db
alongside the registry, but again, I don't know exactly how this works.

I'm just wondering if you've given these any thought. If not, don't
worry, I don't need answers to these just yet, as long as we're thinking
about how to answer these questions before the release.

Regards,

Loren

Kouhei Sutou

Feb 5, 2012, 9:21:47 AM

to yar...@googlegroups.com

Hi,

In <4F2E650...@soen.ca>
"Re: [YARD] YARD 0.8.0 Development Plans" on Sun, 05 Feb 2012 06:16:29 -0500,
Loren Segal <lse...@soen.ca> wrote:

>> For document (including YARD's document):
>> 1. extract text to be translated from docstring.
>> 2. generate .pot file for extracted text.
>> 3. apply translation to docstring.
>> 4. provide a Rake task that extracts text and merges
>> extracted text for easy to maintain.
>> 5. document "how to create i18n supported document".
>
> By document you mean documentation, yes?

Yes. I meant documentation. Sorry.

> So from what I
> understand, the goal is two-fold:
>
> 1. Create tools that helps documentation authors create .pot
> files via a Rake task
> 2. Apply these tools on YARD's own docs so that we can start
> internationalizing YARD docs

Yes. It's accurate.

> If this is accurate,
>
> 1. Can you explain a little bit about the Rake task? Why not
> a YARD command, for example? Like, `yard i18n
> [--extract|--merge] <files>`. We could perhaps associate a
> Rake task that runs this CLI command like the YardocTask,
> for convenience.

For extraction, implementing it as a YARD command is fine.
In my branch, it's `yard doc --format pot`.

For merging, I want to use the external tool `msgmerge`. I
suggest that we use as many existing tools as possible, so that
we share solutions with other projects. That's one of the reasons
why I suggest a gettext-based system. The gettext
system has many tools, and `msgmerge` is one of them.
(There is also an `rmsgmerge` tool that is implemented in
Ruby.)

(I should explain the workflow of the gettext system. I'll do
that below.)

> 2. It's one thing to extract the strings into .pot files for
> YARD docs, but it's a completely different thing to actually
> apply translations. We could use the tools to extract the
> docs, but we probably won't see any translations any time
> soon. So I'm not sure if this step will be of much use just
> yet.

Yuuta Yamada is working on it. :-)
https://github.com/yuutayamada/yard/tree/ja

And his work is almost done. So we can apply YARD's
i18n feature soon.

>> For YARD:
>> 6. extract text to be translated from template.
>> 7. extract text to be translated from lib/yard/**/*.rb.
>> 8. generate .pot file for extracted text.
>> 9. apply translation to template.
>>
>> 'tr("Hardcoded string")' is corresponding to 9. In my work,
>> translation method is named "_" because gettext uses
>> "_". For example _("Hardcoded string").
>>
>> Gettext also uses "N_" that is just a mark for text
>> extractor. It just returns passed string. Here is an
>> implementation of N_ in Ruby:
>>
>>     def N_(message)
>>       message
>>     end
>
> _("string") is fine, I figured it might be that.

Thanks. We will use _() for translation method name.

> Regarding 6 and 7-- is the plan to automate this?

Yes.

> If so, are
> there any tools to automate extraction?

There is `rgettext`, which is bundled in the gettext gem. It can
extract translation target messages from *.rb and *.erb. But the
gettext gem isn't currently maintained, so we would need to do some
work on it. I have already created a Ripper-based text extraction
feature as a YARD handler. But it may be better for the text extraction
tool to be a separate tool rather than a YARD feature.
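
As a rough idea of what static extraction with Ripper can look like, here is a small sketch. It is not the handler from the i18n branch; `extract_messages` and `string_parts` are hypothetical helper names, and the sketch only handles plain string literals passed to `_()` or `N_()` with parentheses.

```ruby
require 'ripper'

# Names of the marker methods whose string arguments we collect.
MARKER_METHODS = %w[_ N_].freeze

# Collects the raw text of every string fragment below a node.
def string_parts(node, acc = [])
  return acc unless node.is_a?(Array)
  if node[0] == :@tstring_content
    acc << node[1]
  else
    node.each { |child| string_parts(child, acc) }
  end
  acc
end

# Walks the Ripper S-expression and records the argument of each
# _("...") or N_("...") call, in source order.
def extract_messages(node, acc = [])
  return acc unless node.is_a?(Array)
  if node[0] == :method_add_arg &&
     node[1].is_a?(Array) && node[1][0] == :fcall &&
     MARKER_METHODS.include?(node[1][1][1])
    acc << string_parts(node[2]).join
  else
    node.each { |child| extract_messages(child, acc) }
  end
  acc
end

source = <<~RUBY
  define_tag N_("Abstract"), :abstract
  puts _("Namespace Listing A-Z")
RUBY

p extract_messages(Ripper.sexp(source))
```

Because the source is only parsed and never evaluated, the extractor does not need `define_tag` or the marker methods to exist.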

> I guess I'm
> asking because-- once the _() and N_() methods are in
> master, should I start using _() in any new code, or will
> some automation tool be doing this at a later stage?

It's the former. There isn't any automation tool.

> FYI, we
> tend to avoid the use of strings for internal stuff (we use
> symbols to represent keys), so it should actually be fairly
> effective to automate the search for strings in source and
> rewrite them. I don't know if this was the actual plan
> though. Doing it manually might be pretty intensive, though
> I suppose it sucks either way.

We only use _() and N_() for label strings, not keys. For
example, define_tag's first argument is a label string:

    lib/yard/tags/library.rb:
    define_tag N_("Abstract"), :abstract

Most label strings are in templates. For example,
templates/default/layout/html/objects.erb has a label
string. It will be converted like the following:

    before:
    <h2>Namespace Listing A-Z</h2>

    after:
    <h2><%= _("Namespace Listing A-Z") %></h2>

Anyway, I'll manually do the work of surrounding label strings
with _() or N_(). :-)
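
For illustration, this is roughly what the converted template does at render time. The sketch uses ERB from the standard library; `LabelView` and the "[ja] ..." placeholder translation are assumptions made for the example and are not YARD's template classes.

```ruby
require 'erb'

# Hypothetical view object: renders an ERB template whose labels are
# wrapped in _(), looking each label up in a Hash-based catalog.
class LabelView
  def initialize(catalog)
    @catalog = catalog
  end

  # Translation method called from inside the template.
  def _(message)
    @catalog.fetch(message, message)
  end

  # Evaluates the template with this object as self, so that _() resolves.
  def render(template_source)
    ERB.new(template_source).result(binding)
  end
end

template = '<h2><%= _("Namespace Listing A-Z") %></h2>'
view = LabelView.new({ "Namespace Listing A-Z" => "[ja] Namespace Listing A-Z" })
puts view.render(template)
```

With an empty catalog the same template renders the original English label, so untranslated locales keep working.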

> I would basically want to know, from a documentation
> writer's perspective, how exactly will YARD be storing and
> using this data? For example, we will have tools to extract
> the .pot files (`yard i18n` or a rake task), and then the
> user will fill in those pot files with translations. Right?

It's not right.

Here is an ASCII-art diagram of the gettext workflow:
http://www.gnu.org/software/gettext/manual/gettext.html#Overview

.pot is a PO template file. The documentation writer generates a
.po file for each language from the .pot file. In the diagram,
this is shown as "PACKAGE.pot -> msgmerge -> LANG.po".

We use `msginit` to create the initial LANG.po from
PACKAGE.pot, but that is not shown in the diagram. (We can
also use `cp` for it, but then some extra work is needed after `cp`.)

The documentation writer fills in the LANG.po file with translations
using their favorite editor. In the diagram, this is shown as
"LANG.po -> PO editor -> New LANG.po".

> So a few questions:
>
> 0. (Just to be sure) the .pot files are the "final" files
> right? Will they need to run some tool against these .pot
> files to turn them into something else? Or will YARD just do
> that step internally? I'm not *too* familiar with gettext,
> but I vaguely know of .po and .mo-- that's about all I know.

No. The .po file is the final file. It is generated from the
.pot file.

The diagram shows the .mo file as the final file:
"New LANG.po -> msgfmt -> LANG.gmo -> install -> /.../LANG/PACKAGE.mo"
but we can do that step internally, so documentation writers don't
need to care about the .mo file.
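
To show why the .po file can serve as the final file for writers, here is a deliberately small sketch of reading msgid/msgstr pairs directly in Ruby. `parse_po` is a hypothetical helper, not something from YARD or the gettext gem, and it ignores real PO features such as multi-line strings, escapes, plurals, and contexts.

```ruby
# Sample LANG.po content; the "[ja] ..." text is a placeholder translation.
SAMPLE_PO = <<~PO
  msgid ""
  msgstr ""

  msgid "Abstract"
  msgstr "[ja] Abstract"

  msgid "Namespace Listing A-Z"
  msgstr ""
PO

# Builds a { msgid => msgstr } Hash from single-line PO entries.
# The header entry (empty msgid) and untranslated entries are skipped.
def parse_po(text)
  catalog = {}
  msgid = nil
  text.each_line do |line|
    case line
    when /\Amsgid\s+"(.*)"\s*\z/
      msgid = $1
    when /\Amsgstr\s+"(.*)"\s*\z/
      catalog[msgid] = $1 unless msgid.nil? || msgid.empty? || $1.empty?
      msgid = nil
    end
  end
  catalog
end

p parse_po(SAMPLE_PO)
```

A real implementation would more likely compile to .mo or reuse an existing PO parser; the sketch only illustrates that the conversion can happen inside the tool rather than as a step for the writer.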

> 1. Where would these .pot files be stored (or whatever final
> files that are created)? Would there be some conventional
> location to store them? This is important for the last
> question...

po/#{LANG}.po is a conventional location. In my branch,
locale/#{LANG}/#{PACKAGE}.po is used.

> 2. How will this work from the runtime API perspective?
> Remember, YARD's plugin/extensibility support is an
> important part of the project. For instance, if someone
> loads up the Registry (YARD::Registry.load!), grabs an
> object (o = Registry.at('YARD::CodeObjects::Base')) and asks
> for the docstring (o.docstring), they'll get the
> docstring. Should we be autotranslating the docstring into
> their language? Or should we make them use _(o.docstring) to
> translate on their own? Perhaps we should have a
> o.docstring.translated to make this more obvious to users
> trying to write plugins, if we don't auto-translate.

We don't provide an auto-translating docstring. In my branch,
o.docstring and o.docstring.to_s return untranslated text,
but o.docstring.document (a new method) and
o.docstring.summary return translated text.
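
The split between untranslated and translated accessors can be pictured with a toy class. This is not YARD::Docstring and not the code in the i18n branch; `SketchDocstring` and its Hash-based catalog are assumptions used only to show the behaviour described above.

```ruby
# Hypothetical docstring-like object: to_s keeps the source text,
# while document returns the translated text when one is available.
class SketchDocstring
  def initialize(text, catalog = {})
    @text = text
    @catalog = catalog
  end

  # Untranslated text, so existing plugins see no change.
  def to_s
    @text
  end

  # Translated text, falling back to the original when nothing matches.
  def document
    @catalog.fetch(@text, @text)
  end
end

doc = SketchDocstring.new("Returns the name.", { "Returns the name." => "[ja] Returns the name." })
puts doc.to_s
puts doc.document
```

Keeping `to_s` untouched means a plugin that never asks for a translation keeps its current behaviour.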

> 3. Regarding the runtime API again-- a user already
> typically loads the registry via Registry.load. But, if .pot
> files are external to the registry, they will ALSO have to
> load up the translations in another command for something
> like o.docstring.translated to work. Is this handled by
> gettext as well?

Do you mean that we need to handle translations for more than one
project at once? For example, we may process YARD's
documentation and RSpec's documentation at once.

I hadn't considered that. We can load many .po files and use
them separately, but it needs some work.

> Could we theoretically compile the
> translations and drop them into the `.yardoc` db directory?

Yes. We can compile the .po file to a .mo file and load the .mo file.
But we need a per-project translation mechanism. For example,
we would need to support using YARD's .mo file and RSpec's .mo file
separately.
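
One way to picture a per-project mechanism is a set of catalogs keyed by project name, similar in spirit to gettext text domains. The sketch below is an assumption for discussion only; `ProjectCatalogs` and the sample entries are made up and do not exist in YARD.

```ruby
# Hypothetical container that keeps each project's translations apart,
# so the same msgid can be translated differently per project.
class ProjectCatalogs
  def initialize
    @catalogs = Hash.new { |hash, project| hash[project] = {} }
  end

  # Registers translations (msgid => msgstr) for one project.
  def add(project, translations)
    @catalogs[project].merge!(translations)
    self
  end

  # Looks a message up in one project's catalog only.
  def translate(project, message)
    @catalogs.fetch(project, {}).fetch(message, message)
  end
end

catalogs = ProjectCatalogs.new
catalogs.add("yard", { "Overview" => "[ja/yard] Overview" })
catalogs.add("rspec", { "Overview" => "[ja/rspec] Overview" })
puts catalogs.translate("yard", "Overview")
puts catalogs.translate("rspec", "Overview")
```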

> That way they would be associated with the registry and
> could be loaded when the Registry is loaded in a single
> command. If the translations are outside the .yardoc
> directory, it would be more difficult to load them together
> (you would have to specify both paths). Note that if my
> assumption about question 0 is wrong, and .pot files are not
> the final files, it would actually make sense to "compile"
> these translations into the .yardoc db alongside the
> registry, but again, I don't know exactly how this works.

I used a separate load path mechanism. I also think that the
compiled file (.mo file) should be put into the .yardoc directory when
the .yardoc directory is created.

> I'm just wondering if you've given these any thought. If
> not, don't worry, I don't need answers to these just yet, as
> long as we're thinking about how to answer these questions
> before the release.

Thanks for sending questions. They are very helpful because
I didn't know what I should explain.

Thanks,
--
kou

Loren Segal

Feb 5, 2012, 3:57:37 PM

to yar...@googlegroups.com

Okay this makes things clearer. Perhaps we could still benefit from a
separate i18n command, if only to make it easier to document clearly to
users. Users will probably not realize to use `yard doc --format pot`,
but we can wrap that into an i18n command that does this, and then it
shows up clearly in the `yard help` command listing, something like:

    i18n    Extracts strings from documentation for internationalization

>
>> 2. It's one thing to extract the strings into .pot files for
>> YARD docs, but it's a completely different thing to actually
>> apply translations. We could use the tools to extract the
>> docs, but we probably won't see any translations any time
>> soon. So I'm not sure if this step will be of much use just
>> yet.
> Yuuta Yamada is working on it. :-)
> https://github.com/yuutayamada/yard/tree/ja
>
> And his work had been almost done. So we can apply YARD's
> i18n feature soon.

Wow that is insane! Thank you so much!

>> If so, are there any tools to automate extraction?
> There is `rgettext` that is bundled in gettext gem. It can
> extract translation target messages from *.rb and *.erb. But
> gettext gem isn't maintained yet. So we need some work on
> it. I already created a Ripper based text extraction feature
> as a YARD handler. But it may be better that text extraction
> tool is a separated tool rather than YARD's feature.

Right, it makes sense as a separate tool. The tool could just be a yard
plugin, of course. yard-rgettext. But you probably don't really need
YARD handlers for this stuff.

> We only use _() and N_() for label string not key. For
> example, define_tag's the first argument is a label string:
>
> lib/yard/tags/library.rb:
> define_tag N_("Abstract"), :abstract

Right, I figured as much, that is why I mentioned it would be easy to
automate. We don't have many cases where Strings are used as keys-- for
example, in that tag line above, :abstract is a symbol because it is a
key... we tend to follow that convention fairly consistently, so we
might actually benefit from a better rgettext tool.

> Most label strings are in templates. For example,
> templates/default/layout/html/objects.erb has a label
> string. It will be converted like the following:
>
> before:
> <h2>Namespace Listing A-Z</h2>
>
> after:
> <h2><%= _("Namespace Listing A-Z") %></h2>
>
> Anyway, I'll do manually works that surround label strings
> with _() or N_(). :-)

Perhaps we could use something like nokogiri to automate this? Text
nodes are pretty easy to find in an X(HT)ML document. Though the ERB
might confuse the parser.
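
As a very rough sketch of the automation idea, the following uses a regular expression from the standard library as a stand-in for a real HTML parser such as nokogiri. `wrap_labels` is a hypothetical helper; it only handles plain text sitting directly between two tags and skips anything that already contains ERB.

```ruby
# Matches text that sits directly between two tags and contains no
# markup or ERB delimiters; surrounding whitespace is captured separately.
TEXT_BETWEEN_TAGS = />(\s*)([^<>%\s][^<>%]*?)(\s*)</

# Wraps each plain text node in <%= _("...") %>, keeping whitespace as is.
def wrap_labels(template)
  template.gsub(TEXT_BETWEEN_TAGS) do
    lead, text, trail = $1, $2, $3
    ">#{lead}<%= _(#{text.inspect}) %>#{trail}<"
  end
end

before = '<h2>Namespace Listing A-Z</h2>'
after = wrap_labels(before)
puts after
```

Running the helper a second time on its own output leaves it unchanged for this input, because already wrapped labels contain `%` and no longer match. Attribute values, inline scripts, and text mixed with ERB would need a proper parser.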

Automation is important because (a) we want to make it easy for template
customizers to make i18n-friendly templates, and (b) if we expand YARD's
own templates, we don't want to be hampered by the extra work involved
in i18nizing everything-- the more tools can help the better.

If you worked on tools to do this more easily instead of the time spent
painstakingly replacing all those instances manually, we end up with a
system that's easier to use in the future. I don't know how much work
such a tool would be though, so this is just food for thought.

>
>
>> I would basically want to know, from a documentation
>> writer's perspective, how exactly will YARD be storing and
>> using this data? For example, we will have tools to extract
>> the .pot files (`yard i18n` or a rake task), and then the
>> user will fill in those pot files with translations. Right?
> It's not right.
>
> Here is an ASCII art diagram of the gettext workflow:
> http://www.gnu.org/software/gettext/manual/gettext.html#Overview
>
> .pot is a PO template file. The documentation writer generates
> a .po file for each language from the .pot file. In the ASCII art,
> this is shown as "PACKAGE.pot -> msgmerge -> LANG.po".
>
> We use `msginit` to create the initial LANG.po from
> PACKAGE.pot, but that is not shown in the ASCII art. (We can
> also use `cp` for it, but some extra work is needed after `cp`.)
>
> The documentation writer fills in the LANG.po file with
> translations using their favorite editor. In the ASCII art, this
> is shown as "LANG.po -> PO editor -> New LANG.po".

Thanks for the explanation and link. I will read more about this in the
coming week so I'm up to speed on what is being done!

Perhaps you can answer this before I find it in the docs though, but why
is there a .pot and .po, if .mo is the final stage? Could we not just
automate this stage from .pot into .po as well? Forgive my ignorance if
that is a stupid question-- I'm just trying to make this as easy on our
users as possible, so the more things YARD can automate, the better.

>> So a few questions:
>>
>> 0. (Just to be sure) the .pot files are the "final" files
>> right? Will they need to run some tool against these .pot
>> files to turn them into something else? Or will YARD just do
>> that step internally? I'm not *too* familiar with gettext,
>> but I vaguely know of .po and .mo-- that's about all I know.
> No. The .po file is the final file. The .po file is generated
> from the .pot file.
>
> The ASCII art says the .mo file is the final file, as in
> "New LANG.po -> msgfmt -> LANG.gmo -> install -> /.../LANG/PACKAGE.mo"
> but we can do that internally. So the documentation writer doesn't
> need to care about the .mo file.

How exactly does .po get translated into .mo inside of YARD? Do we
depend on any external system tools that the Ruby stdlib does not
provide? I've always wanted to keep the amount of dependencies to a
minimum. Ruby does not ship with any gettext libraries, correct? That's
something we will have to look into.

>> 1. Where would these .pot files be stored (or whatever final
>> files that are created)? Would there be some conventional
>> location to store them? This is important for the last
>> question...
> po/#{LANG}.po is a conventional location. In my branch,
> locale/#{LANG}/#{PACKAGE}.po is used.

So, the .pot files are never stored, they are just an intermediary phase
to .po, which is stored in their source repository, yes?

>
>> 2. How will this work from the runtime API perspective?
>> Remember, YARD's plugin/extensibility support is an
>> important part of the project. For instance, if someone
>> loads up the Registry (YARD::Registry.load!), grabs an
>> object (o = Registry.at('YARD::CodeObjects::Base')) and asks
>> for the docstring (o.docstring), they'll get the
>> docstring. Should we be autotranslating the docstring into
>> their language? Or should we make them use _(o.docstring) to
>> translate on their own? Perhaps we should have a
>> o.docstring.translated to make this more obvious to users
>> trying to write plugins, if we don't auto-translate.
> We don't provide an auto-translating docstring. In my branch,
> o.docstring and o.docstring.to_s return untranslated text,
> but o.docstring.document (new method) and
> o.docstring.summary return translated text.

Hmm, this might be a little confusing. I will wait until the pull
request is made to look at it fully, though. Indeed, Docstring will be
problematic because it extends String, so that will be an interesting
problem.
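
To make the concern concrete, here is a rough sketch of the explicit (non-automatic) option, assuming a plain Hash as the translation catalog. SketchDocstring and its translated method are hypothetical names for illustration only; they are not YARD::Docstring and not what the branch implements.

```ruby
# Hypothetical stand-in for a String-based docstring class; not YARD::Docstring.
class SketchDocstring < String
  # Returns the translated text from a msgid => msgstr hash,
  # falling back to the original text when no translation exists.
  def translated(catalog)
    catalog.fetch(to_s, to_s)
  end
end

doc = SketchDocstring.new("Returns the object name.")
catalog = { "Returns the object name." => "[ja] Returns the object name." }

puts doc                      # the String value itself stays untranslated
puts doc.translated(catalog)  # translation is an explicit, separate call
puts doc.translated({})       # no entry: falls back to the original text
```

Because the object is itself a String, every existing String method keeps returning source-language text, which is why translation has to be opt-in here.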

>> 3. Regarding the runtime API again-- a user already
>> typically loads the registry via Registry.load. But, if .pot
>> files are external to the registry, they will ALSO have to
>> load up the translations in another command for something
>> like o.docstring.translated to work. Is this handled by
>> gettext as well?
> It means that we need to handle translations for one or more
> projects at once. Right? For example, we may process YARD's
> documentation and RSpec's documentation at once.

Yes, this is an example of a concern. For example, rubydoc.info runs
YARD live on the server, and generates HTML for the projects at runtime
inside of a Rack handler using the Server architecture, so it could be
serving many projects in the same process. We run it in separate threads
(which gettext supposedly supports for loading different languages), and
Registry is also local to a thread, so this would work for the most
part. If we had .mo files local to the Registry, then our translation
data would also be thread local, and would pose no problems for a setup
like rubydoc.info. But there might be other issues that I haven't
thought about.
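
As a rough illustration of thread-local translation data, here is a sketch that keeps one catalog per thread, assuming a plain Hash catalog. SketchThreadCatalog and the placeholder catalogs are hypothetical; this is not how YARD's Registry or any gettext library stores its state.

```ruby
# Hypothetical per-thread catalog holder; not part of YARD or gettext.
module SketchThreadCatalog
  module_function

  # Stores the catalog in a thread-local variable of the current thread.
  def use(catalog)
    Thread.current.thread_variable_set(:sketch_catalog, catalog)
  end

  # Translates with the current thread's catalog, falling back to the msgid.
  def translate(msgid)
    catalog = Thread.current.thread_variable_get(:sketch_catalog) || {}
    catalog.fetch(msgid, msgid)
  end
end

# Placeholder catalogs for two projects served by the same process.
catalogs = {
  "yard"  => { "Overview" => "[yard] Overview" },
  "rspec" => { "Overview" => "[rspec] Overview" }
}

threads = catalogs.map do |project, catalog|
  Thread.new do
    SketchThreadCatalog.use(catalog)
    [project, SketchThreadCatalog.translate("Overview")]
  end
end

results = threads.map(&:value).to_h
p results
```

Each thread only ever sees the catalog it installed, so two projects can be rendered side by side without sharing translation state.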

>
> I hadn't considered it. We can load many .po files and use
> them separately, but that needs some work.

Indeed, don't worry about this just yet. We will review the possible
compatibility issues when we have the initial implementation done. It's
something to keep in mind, though.

>
>> Could we theoretically compile the
>> translations and drop them into the `.yardoc` db directory?
> Yes. We can compile the .po file to a .mo file and load the .mo
> file. But we need a per-project translation mechanism. For example,
> we need to support using YARD's .mo file and RSpec's .mo file
> separately.
>
>> That way they would be associated with the registry and
>> could be loaded when the Registry is loaded in a single
>> command. If the translations are outside the .yardoc
>> directory, it would be more difficult to load them together
>> (you would have to specify both paths). Note that if my
>> assumption about question 0 is wrong, and .pot files are not
>> the final files, it would actually make sense to "compile"
>> these translations into the .yardoc db alongside the
>> registry, but again, I don't know exactly how this works.
> I used a separate load path mechanism. I also think that
> the compiled file (.mo file) could be put into the .yardoc
> directory when the .yardoc directory is created.

Okay, that will make the API much easier to use if we can do this. I
think we should look at attaching translation data to the Registry,
then-- by creating some new attributes/methods and serializing them with
the YardocSerializer and RegistryStore classes.
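
A minimal sketch of the idea, assuming the translation data is a plain Hash that gets marshalled into the same directory as the registry. The helper names and the translations_LANG.dat file name are hypothetical; this does not use YardocSerializer or RegistryStore and only illustrates keeping both in one place.

```ruby
require "tmpdir"

# Hypothetical helpers; the file name below is an assumption, not a YARD convention.
def sketch_catalog_path(yardoc_dir, lang)
  File.join(yardoc_dir, "translations_#{lang}.dat")
end

# Writes the catalog (msgid => msgstr) next to the registry data.
def save_catalog(yardoc_dir, lang, catalog)
  File.binwrite(sketch_catalog_path(yardoc_dir, lang), Marshal.dump(catalog))
end

# Loads the catalog, or returns an empty one if none was stored.
def load_catalog(yardoc_dir, lang)
  path = sketch_catalog_path(yardoc_dir, lang)
  File.exist?(path) ? Marshal.load(File.binread(path)) : {}
end

Dir.mktmpdir do |yardoc_dir|
  save_catalog(yardoc_dir, "ja", { "Overview" => "[ja] Overview" })
  p load_catalog(yardoc_dir, "ja")
  p load_catalog(yardoc_dir, "fr")
end
```

With this layout a single path is enough to find both the registry and its translations.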

>
>> I'm just wondering if you've given these any thought. If
>> not, don't worry, I don't need answers to these just yet, as
>> long as we're thinking about how to answer these questions
>> before the release.
> Thanks for sending the questions. They are very helpful because
> I didn't know what I should explain.

Your explanations are super helpful, thank you!

Loren

Kouhei Sutou

Feb 7, 2012, 10:32:24 AM

to yar...@googlegroups.com

Hi,

In <4F2EED41...@soen.ca>
"Re: [YARD] YARD 0.8.0 Development Plans" on Sun, 05 Feb 2012 15:57:37 -0500,
Loren Segal <lse...@soen.ca> wrote:

> Okay this makes things clearer. Perhaps we could still
> benefit from a separate i18n command, if only to make it
> easier to document clearly to users. Users will probably not
> realize to use `yard doc --format pot`, but we can wrap that
> into an i18n command that does this, and then it shows up
> clearly in the `yard help` command listing, something like:
>
> i18n Extracts strings from documentation for
> internationalization

I agree with it.

>>> 2. It's one thing to extract the strings into .pot files for
>>> YARD docs, but it's a completely different thing to actually
>>> apply translations. We could use the tools to extract the
>>> docs, but we probably won't see any translations any time
>>> soon. So I'm not sure if this step will be of much use just
>>> yet.
>> Yuuta Yamada is working on it. :-)
>> https://github.com/yuutayamada/yard/tree/ja
>>
>> And his work is almost done. So we can apply YARD's
>> i18n feature soon.
>
> Wow that is insane! Thank you so much!

I'll pass your thanks on to him. :-)

>>> If so, are there any tools to automate extraction?
>> There is `rgettext`, which is bundled in the gettext gem. It can
>> extract translation target messages from *.rb and *.erb. But
>> the gettext gem isn't maintained at the moment, so it needs some
>> work. I already created a Ripper-based text extraction feature
>> as a YARD handler. But it may be better for the text extraction
>> tool to be a separate tool rather than a YARD feature.
>
> Right, it makes sense as a separate tool. The tool could
> just be a yard plugin, of course. yard-rgettext. But you
> probably don't really need YARD handlers for this stuff.

OK. I'll work on providing it as a separate tool.

> > We only use _() and N_() for label strings, not keys. For
> > example, the first argument of define_tag is a label string:
> >
> > lib/yard/tags/library.rb:
> > define_tag N_("Abstract"), :abstract
>
> Right, I figured as much, that is why I mentioned it would
> be easy to automate. We don't have many cases where Strings
> are used as keys-- for example, in that tag line above,
> :abstract is a symbol because it is a key... we tend to
> follow that convention fairly consistently, so we might
> actually benefit from a better rgettext tool.

We could implement a better rgettext tool for YARD by handling
'define_tag' specially. For example, we would just write
'define_tag' like this:

define_tag :abstract

and the better rgettext tool would extract the "Abstract" label from
the ':abstract' key.

(The tool would be YARD-specific.)
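
A rough sketch of that extraction step, assuming a simple line-based scan rather than a real Ruby parser. The helper name labels_from_define_tag and the :see_also key are hypothetical and only serve the example; they are not taken from YARD's tag library.

```ruby
# Hypothetical extractor: finds `define_tag :key` lines and derives a
# human-readable label from each key by capitalizing its words.
def labels_from_define_tag(source)
  source.scan(/^\s*define_tag\s+:(\w+)/).flatten.map do |key|
    key.split("_").map(&:capitalize).join(" ")
  end
end

source = <<~RUBY
  define_tag :abstract
  define_tag :see_also
RUBY

p labels_from_define_tag(source)
```

A regular expression is enough for this illustration; a real tool would more likely walk the parsed source so that unusual formatting does not get missed.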

>> Most label strings are in templates. For example,
>> templates/default/layout/html/objects.erb has a label
>> string. It will be converted like the following:
>>
>> before:
>> <h2>Namespace Listing A-Z</h2>
>>
>> after:
>> <h2><%= _("Namespace Listing A-Z") %></h2>
>>
>> Anyway, I'll do the manual work of surrounding label strings
>> with _() or N_(). :-)
>
> Perhaps we could use something like nokogiri to automate
> this? Text nodes are pretty easy to find in an X(HT)ML
> document. Though the ERB might confuse the parser.

We can use a Nokogiri-like tool for it when the template is an
XHTML document. But we also have templates for text formats.
It seems that we can parse the compiled ERB as Ruby source, which
would not confuse the parser.
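
Here is a small sketch of the compiled-ERB idea using only the standard library: ERB#src gives the Ruby source that a template compiles to, and Ripper.lex can then pick out the string literal contents. The helper name template_text_fragments is hypothetical, and stripping tags with a regular expression is a simplification for the example.

```ruby
require "erb"
require "ripper"

# Hypothetical helper: compiles an ERB template to Ruby source and collects
# the literal text fragments that the compiled code emits.
def template_text_fragments(template)
  ruby_source = ERB.new(template).src
  Ripper.lex(ruby_source)
        .select { |token| token[1] == :on_tstring_content }
        .map { |token| token[2].gsub(/<[^>]+>/, " ").strip } # drop HTML tags
        .reject(&:empty?)
end

template = "<h2>Namespace Listing A-Z</h2><p><%= count %> items</p>"
p template_text_fragments(template)
```

The template is only compiled, never evaluated, so names like count do not need to be defined for the extraction to work.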

> Automation is important because (a) we want to make it easy
> for template customizers to make i18n-friendly templates,
> and (b) if we expand YARD's own templates, we don't want to
> be hampered by the extra work involved in i18nizing
> everything-- the more tools can help the better.
>
> If you worked on tools to do this more easily instead of
> spending the time painstakingly replacing all those instances
> manually, we would end up with a system that's easier to use in
> the future. I don't know how much work such a tool would be
> though, so this is just food for thought.

OK. I'll consider it.

> Perhaps you can answer this before I find it in the docs
> though, but why is there a .pot and .po, if .mo is the final
> stage? Could we not just automate this stage from .pot into
> .po as well?

.mo is for machines, not humans. .mo is a binary file
format. We cannot edit a .mo file directly.

.pot is always auto-generated from scratch. .po is edited by
a human and merged with .pot to import new translation target
messages.

source -> .pot (overwrite) <- .pot is a temporary file.

"LANG.po -> PO editor -> New LANG.po" <- translate existing messages.
"PACKAGE.pot -> msgmerge -> LANG.po" <- import new messages from .pot.

Does that answer your question?
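
To illustrate the "import new messages" step, here is a simplified sketch of what a merge does conceptually, assuming the .pot is just a list of msgids and the .po is a Hash of msgid => msgstr. The helper merge_catalog is hypothetical and is not msgmerge itself: the real tool also handles fuzzy matching and keeps obsolete entries as comments, which this sketch simply drops.

```ruby
# Hypothetical helper mimicking the idea behind msgmerge:
# keep existing translations, add new msgids with an empty msgstr,
# and drop msgids that no longer appear in the template.
def merge_catalog(pot_msgids, po_translations)
  pot_msgids.each_with_object({}) do |msgid, merged|
    merged[msgid] = po_translations.fetch(msgid, "")
  end
end

# Freshly extracted template (regenerated from source every time).
pot = ["Abstract", "Namespace Listing A-Z"]

# Existing hand-edited translations; the msgstr values are placeholders.
ja_po = {
  "Abstract"       => "[ja] Abstract",
  "Obsolete label" => "[ja] Obsolete label"
}

p merge_catalog(pot, ja_po)
```

This is why the .pot can be thrown away after each run while the .po has to be kept: only the .po carries work done by a human.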

> How exactly does .po get translated into .mo inside of YARD?
> Do we depend on any external system tools that the Ruby
> stdlib does not provide?

Yes. We need to use the fast_gettext gem.

> I've always wanted to keep the
> amount of dependencies to a minimum. Ruby does not ship with
> any gettext libraries, correct?

Yes. Ruby doesn't ship with any gettext libraries.

>>> 1. Where would these .pot files be stored (or whatever final
>>> files that are created)? Would there be some conventional
>>> location to store them? This is important for the last
>>> question...
>> po/#{LANG}.po is a conventional location. In my branch,
>> locale/#{LANG}/#{PACKAGE}.po is used.
>
> So, the .pot files are never stored, they are just an
> intermediary phase to .po, which is stored in their source
> repository, yes?

Yes.
.pot files aren't stored in their repository.
.po files are stored in their repository.

>>> 2. How will this work from the runtime API perspective?
>>> Remember, YARD's plugin/extensibility support is an
>>> important part of the project. For instance, if someone
>>> loads up the Registry (YARD::Registry.load!), grabs an
>>> object (o = Registry.at('YARD::CodeObjects::Base')) and asks
>>> for the docstring (o.docstring), they'll get the
>>> docstring. Should we be autotranslating the docstring into
>>> their language? Or should we make them use _(o.docstring) to
>>> translate on their own? Perhaps we should have a
>>> o.docstring.translated to make this more obvious to users
>>> trying to write plugins, if we don't auto-translate.
>> We don't provide an auto-translating docstring. In my branch,
>> o.docstring and o.docstring.to_s return untranslated text,
>> but o.docstring.document (new method) and
>> o.docstring.summary return translated text.
>
> Hmm, this might be a little confusing. I will wait until the
> pull request is made to look at it fully, though. Indeed,
> Docstring will be problematic because it extends String, so
> that will be an interesting problem.

OK. Should I start sending pull requests? Or should I
explain more first?

Here is the todo list as I understand it:

1. We use _() and N_() to mark translatable messages in
YARD.

2. A `yard i18n` command will be created to extract messages
from source and document files.
2.1. We create the message extraction tool as a separate
tool.

3. We use the `yard i18n` command to localize YARD's
documentation. We use Yuuta Yamada's translation for it.

4. We should handle multiple translations in one process. At
least, we should handle one translation per registry.

5. We store .mo files in the .yardoc/ directory.

Here is the list of open questions as I understand it:

1. Do we create a YARD-specific rgettext tool?
2. Can we extract messages from templates without _() and
N_() marks?
3. Do we depend on fast_gettext?
4. How do we handle translation of docstrings?

>>> 3. Regarding the runtime API again-- a user already
>>> typically loads the registry via Registry.load. But, if .pot
>>> files are external to the registry, they will ALSO have to
>>> load up the translations in another command for something
>>> like o.docstring.translated to work. Is this handled by
>>> gettext as well?
>> It means that we need to handle translations for one or more
>> projects at once. Right? For example, we may process YARD's
>> documentation and RSpec's documentation at once.
>
> Yes, this is an example of a concern. For example,
> rubydoc.info runs YARD live on the server, and generates
> HTML for the projects at runtime inside of a Rack handler
> using the Server architecture, so it could be serving many
> projects in the same process. We run it in separate threads
> (which gettext supposedly supports for loading different
> languages), and Registry is also local to a thread, so this
> would work for the most part. If we had .mo files local to
> the Registry, then our translation data would also be thread
> local, and would pose no problems for a setup like
> rubydoc.info. But there might be other issues that I haven't
> thought about.

OK. Thanks for your explanation.

>> I hadn't considered it. We can load many .po files and use
>> them separately, but that needs some work.
>
> Indeed, don't worry about this just yet. We will review the
> possible compatibility issues when we have the initial
> implementation done. It's something to keep in mind, though.

OK.

Thanks,
--
kou

Kouhei Sutou

Apr 14, 2012, 12:14:20 AM

to yar...@googlegroups.com

Hi,

In <20120208.003224.225...@cozmixng.org>
"Re: [YARD] YARD 0.8.0 Development Plans" on Wed, 08 Feb 2012 00:32:24 +0900 (JST),
Kouhei Sutou <k...@cozmixng.org> wrote:

> OK. Should I start sending pull requests? Or should I
> explain more first?
>
> Here is the todo list as I understand it:
>
> 1. We use _() and N_() to mark translatable messages in
> YARD.
>
> 2. A `yard i18n` command will be created to extract messages
> from source and document files.
> 2.1. We create the message extraction tool as a separate
> tool.
>
> 3. We use the `yard i18n` command to localize YARD's
> documentation. We use Yuuta Yamada's translation for it.
>
> 4. We should handle multiple translations in one process. At
> least, we should handle one translation per registry.
>
> 5. We store .mo files in the .yardoc/ directory.

First, I implemented item 2 (2.1 isn't implemented yet) and updated
the pull request https://github.com/lsegal/yard/pull/395, which
adds .pot file output.

We can generate a .pot file with the following command:

% yard i18n -o po/yard.pot --yardopts .yardopts_i18n

The po/yard.pot file includes the messages extracted from source
and document files.
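
For anyone who has not seen a PO template before, here is a sketch of the general gettext entry layout (a "#:" reference comment, a msgid, and an empty msgstr). The helper pot_entry and the sample reference are hypothetical; this illustrates the standard PO entry format and is not the actual output of `yard i18n`.

```ruby
# Hypothetical formatter for a single PO template entry.
# Backslashes, double quotes and newlines are escaped as in C strings.
def pot_entry(msgid, reference)
  escaped = msgid.gsub("\\") { "\\\\" }
                 .gsub('"') { '\\"' }
                 .gsub("\n") { '\\n' }
  "#: #{reference}\nmsgid \"#{escaped}\"\nmsgstr \"\"\n"
end

# The reference below is a made-up location for illustration.
puts pot_entry("Namespace Listing A-Z", "templates/example.erb:1")
```

The empty msgstr is what a translator later fills in once the template has been turned into a LANG.po file.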

The pull request includes specs for the changes. Could you
review it?

Thanks,
--
kou

Loren Segal

Apr 14, 2012, 11:04:40 PM

to yar...@googlegroups.com, Kouhei Sutou

Hey kou,

I will be looking at this in the next few days. At first glance it looks
very well tested, so that's great!

Loren

Kouhei Sutou

Apr 18, 2012, 8:32:05 AM

to yar...@googlegroups.com

Hi,

In <4F8A3AC8...@soen.ca>
"Re: [YARD] YARD 0.8.0 Development Plans" on Sat, 14 Apr 2012 23:04:40 -0400,
Loren Segal <lse...@soen.ca> wrote:

> I will be looking at this in the next few days. At first glance it
> looks very well tested, so that's great!