Real world file key examples

Valery Kholodkov

Apr 15, 2009, 4:53:54 AM
to mogile
Greetings!

I'm working on a MogileFS client for nginx:

http://github.com/vkholodkov/nginx-mogilefs-module/tree/master

So far it works for me. Now I want to make the module as easy to
configure as possible. At the moment, that depends on how people want
to transform requested URIs into MogileFS keys, and on what they use
as keys.

So the question is: can you give me a few examples of what you use as
file keys?

--
Regards,
Valery Kholodkov

stefan

Apr 15, 2009, 6:15:19 AM
to mog...@googlegroups.com
On Wed, Apr 15, 2009 at 10:53 AM, Valery Kholodkov <val...@grid.net.ru> wrote:

> So the question is: can you give me a few examples of what you use as
> file keys?


Hey!

Happy to help, although I guess it would be best to support whatever mogilefs itself supports: not adding extra limits, and not accepting stuff that mogile wouldn't like. But maybe I didn't get the point :p

So, for user-uploaded resources in our "blog" app (article images and videos, profile pictures, ...) our file keys are numeric IDs, such as 5304641. That ID is the one the app itself uses.
Thumbnails of those images get a key like T5304641-85x85-jpg and video screenshots get a key like S5304641-jpg


Cheers,
Stefan
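
A minimal sketch of the key scheme described in the message above, written in Python. Only the key formats (5304641, T5304641-85x85-jpg, S5304641-jpg) come from the message; the function names and parameters are hypothetical and chosen for illustration.

```python
def original_key(resource_id: int) -> str:
    """Key for the uploaded resource itself: the application's numeric ID."""
    return str(resource_id)


def thumbnail_key(resource_id: int, width: int, height: int, fmt: str) -> str:
    """Key for a thumbnail, following the T<id>-<width>x<height>-<format> pattern."""
    return f"T{resource_id}-{width}x{height}-{fmt}"


def screenshot_key(resource_id: int, fmt: str) -> str:
    """Key for a video screenshot, following the S<id>-<format> pattern."""
    return f"S{resource_id}-{fmt}"
```

With this layout every derived file can be found from the application ID alone, so no extra lookup table is needed to go from a resource to its thumbnails.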



val...@grid.net.ru

Apr 15, 2009, 6:47:33 AM
to mog...@googlegroups.com

----- "stefan" <deube...@gmail.com> wrote:

> On Wed, Apr 15, 2009 at 10:53 AM, Valery Kholodkov <val...@grid.net.ru> wrote:
>
> > So the question is: can you give me a few examples of what you use as
> > file keys?
>
> Hey!
>
> Happy to help, although I guess it would be best to support whatever
> mogilefs itself supports: not adding extra limits, and not accepting
> stuff that mogile wouldn't like.

Both of these are in my plans.

> But maybe I didn't get the point :p

Let me be more specific: if I want to fetch a file via a public URL and the file is stored in MogileFS, what is the most convenient pattern for this URL (for the user or administrator), given that I'm implementing the application that needs to extract the file key from that public URL?

> So, for user-uploaded resources in our "blog" app (article images and
> videos, profile pictures, ...) our file keys are numeric IDs, such as
> 5304641. That ID is the one the app itself uses.
> Thumbnails of those images get a key like T5304641-85x85-jpg and video
> screenshots get a key like S5304641-jpg

How do you decide what content type to return to the client if a file like 5304641 is requested? As I understand it, neither the tracker nor the storage nodes are able to return a proper content type.

--
Regards,
Valery Kholodkov
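
One possible convention for the URL question above, sketched in Python: expose keys under a fixed public prefix and take the rest of the path as the key. The /download/ prefix and the function name are assumptions for illustration, not something the module prescribes.

```python
from typing import Optional
from urllib.parse import unquote, urlsplit


def key_from_uri(uri: str, prefix: str = "/download/") -> Optional[str]:
    """Return the file key embedded in a public URI, or None if it does not match.

    The query string is ignored and percent-escapes in the path are decoded.
    """
    path = urlsplit(uri).path
    if not path.startswith(prefix):
        return None
    key = unquote(path[len(prefix):])
    return key or None  # an empty remainder is not a valid key
```

For example, key_from_uri("/download/T5304641-85x85-jpg") yields the key T5304641-85x85-jpg, while a URI outside the prefix yields None.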

Ask Bjørn Hansen

Apr 15, 2009, 1:02:35 PM
to mog...@googlegroups.com

On Apr 15, 2009, at 3:47, val...@grid.net.ru wrote:

> How do you decide what content type to return to the client if file
> like 5304641 is requested? As I understand, neither tracker nor
> storage nodes are able to return proper content type?

Some of our files go through our main application, which stores
a URL -> key + content type + (other metadata) mapping for that.

For our "dumb" media server we just use the filename to determine the
content type.


- ask
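
The two approaches in the message above can be combined in a short Python sketch: consult stored metadata first, then fall back to guessing from the file name. The metadata dictionary (keyed here by file key for simplicity) and the default type are assumptions for illustration; `mimetypes.guess_type` is from the Python standard library.

```python
import mimetypes
from typing import Dict, Optional


def content_type_for(
    key: str,
    metadata: Optional[Dict[str, str]] = None,
    default: str = "application/octet-stream",
) -> str:
    """Pick a Content-Type for a file key.

    1. Use the application's stored metadata if it has an entry for the key.
    2. Otherwise guess from the file name extension.
    3. Otherwise fall back to a generic binary type.
    """
    if metadata and key in metadata:
        return metadata[key]
    guessed, _encoding = mimetypes.guess_type(key)
    return guessed or default
```

A purely numeric key such as 5304641 has no extension, so without a metadata entry it ends up with the generic default, which is the problem raised earlier in the thread.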

Valery Kholodkov

Apr 21, 2009, 5:16:56 AM
to mog...@googlegroups.com

Hello!

For those who are interested, I released the MogileFS module for nginx last week:

http://www.grid.net.ru/nginx/mogilefs.en.html

Your feedback is welcome.

----- "Valery Kholodkov" <val...@grid.net.ru> wrote:

> Greetings!
>
> I'm working on a MogileFS client for nginx:
>
> http://github.com/vkholodkov/nginx-mogilefs-module/tree/master
>
> So far it works for me. Now I want to make the module as easy to
> configure as possible. At the moment, that depends on how people want
> to transform requested URIs into MogileFS keys, and on what they use
> as keys.

--
Regards,
Valery Kholodkov

Timu EREN

Apr 22, 2009, 2:18:11 AM
to mog...@googlegroups.com
Hi Valery,

It's a great module. It would probably be good if we could define more
than one tracker in nginx.conf, for example:

mogilefs_trackers {
    server 192.168.2.2 domain=domain.com connect_timeout=60 read_timeout=60 send_timeout=60 weight=2;
    server 192.168.6.6 domain=domain.com connect_timeout=60 read_timeout=60 send_timeout=60 weight=3;
}

mogilefs_pass {
    proxy_pass $mogilefs_path;
}

Thanks a lot. I haven't been able to test it yet, but it's a great module for MogileFS and nginx.

--
Regards && have a good day at work
Timu EREN ( a.k.a selam )

張筱楓

Apr 22, 2009, 2:29:04 AM
to mog...@googlegroups.com
Great news!
Could this module work like perlbal's reproxy feature?

Eric

Michael Shadle

Apr 22, 2009, 2:50:28 AM
to mog...@googlegroups.com
I believe that if Valery can make this use the normal HTTP upstream
mechanism, it will retry after a timeout and won't retry upstreams
that have already failed.

Talking to dormando earlier, it sounds like perlbal has a lot of great
proxying capabilities, and I think nginx would benefit from having them
for many reasons (this module being one of them).

Valery Kholodkov

Apr 22, 2009, 3:18:22 AM
to mog...@googlegroups.com
It is already possible via named upstreams:

upstream trackers {
    server 10.10.10.1:6001;
    server 10.10.10.2:6001;
    server 10.10.10.3:6001;
}

server {
    location /download/ {
        mogilefs_tracker trackers;
        [...]
    }
}

The only thing missing is a directive similar to
memcached_next_upstream, which I'm going to implement in the next version.
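
To illustrate what a "next upstream" behaviour means for a list of trackers, here is a hedged Python sketch: try each tracker in turn and move on when one fails. It is not how nginx or the module implements this; the function names and the injected `query` callable are hypothetical.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def query_with_failover(trackers: Iterable[str], query: Callable[[str], T]) -> T:
    """Call query(tracker) for each tracker until one succeeds.

    Network-level failures (OSError and its subclasses, such as timeouts and
    refused connections) move on to the next tracker. If every tracker fails,
    a ConnectionError is raised, chained to the last failure.
    """
    last_error: Optional[BaseException] = None
    for tracker in trackers:
        try:
            return query(tracker)
        except OSError as exc:
            last_error = exc
    raise ConnectionError("all trackers failed") from last_error
```

Passing the query as a callable keeps the retry policy separate from the actual tracker conversation, so the same loop works for any request type.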

Valery Kholodkov

Apr 22, 2009, 3:25:13 AM
to mog...@googlegroups.com
I don't know anything about that; it would be nice if you could point me to some info.


--
Regards,
Valery Kholodkov

Michael Shadle

Apr 22, 2009, 3:35:27 AM
to mog...@googlegroups.com
I don't think it is very well documented.

dormando: ping :)

Michael Shadle

Apr 22, 2009, 3:37:35 AM
to mog...@googlegroups.com
yes, but this creates a double configuration. now you have the
upstreams setup in the mogilefs tracker db -and- the nginx conf.

we need it to be in a single location. so you need to fake an upstream
essentially, create the same structure internally inside of nginx but
create it on the fly (if that is possible)

also, that is a very important detail that got lost - you need to
include the tracker address(es)

your module should really only be talking to trackers for info "hey
where is this file" and then when it gets a list back of upstreams,
create that dynamic behind-the-scenes upstream structure. that's at
least how i would probably describe it.
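
The "hey where is this file" exchange can be sketched in Python. This assumes the tracker's line-based text protocol as I understand it: a `get_paths` request with URL-encoded arguments, answered by an `OK` line carrying `paths=N` and `path1`..`pathN`, or by an `ERR` line. The helper names are hypothetical, and no networking is shown.

```python
from typing import List
from urllib.parse import parse_qs, urlencode


def build_get_paths(domain: str, key: str) -> bytes:
    """Build the request line asking a tracker where a key is stored."""
    args = urlencode({"domain": domain, "key": key})
    return f"get_paths {args}\r\n".encode("ascii")


def parse_get_paths(line: bytes) -> List[str]:
    """Parse a tracker reply into the list of storage-node URLs.

    Raises LookupError if the tracker answered with anything other than OK.
    """
    status, _, rest = line.decode("ascii").strip().partition(" ")
    if status != "OK":
        raise LookupError(rest or "malformed tracker response")
    fields = parse_qs(rest)
    count = int(fields["paths"][0])
    return [fields[f"path{i}"][0] for i in range(1, count + 1)]
```

The returned list is what a front end would then try in order, which is the dynamic set of storage nodes the message above is asking for.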

Valery Kholodkov

Apr 22, 2009, 5:38:44 AM
to mog...@googlegroups.com

----- "Michael Shadle" <mik...@gmail.com> wrote:

> yes, but this creates a double configuration. now you have the
> upstreams setup in the mogilefs tracker db -and- the nginx conf.

Which upstreams do you mean here? Trackers or storage nodes?

> we need it to be in a single location. so you need to fake an
> upstream
> essentially, create the same structure internally inside of nginx but
> create it on the fly (if that is possible)
>
> also, that is a very important detail that got lost - you need to
> include the tracker address(es)

There are obviously tracker addresses in the configuration below.

> your module should really only be talking to trackers for info "hey
> where is this file" and then when it gets a list back of upstreams,
> create that dynamic behind-the-scenes upstream structure. that's at
> least how i would probably describe it.

I cannot create a behind-the-scenes upstream structure, because at the moment it would be unreasonable not to delegate the proxying logic to the proxy module. On the other hand, the current nginx API does not allow a module to flexibly inherit the proxy module's functionality.

--
Regards,
Valery Kholodkov

Michael Shadle

Apr 22, 2009, 5:13:06 PM
to mog...@googlegroups.com
On Wed, Apr 22, 2009 at 2:38 AM, Valery Kholodkov <val...@grid.net.ru> wrote:

> Which upstreams do you mean here? Trackers or storage nodes?

Trackers should be in nginx.conf
Storage nodes are managed by the trackers, so those should not be
hardcoded/repeated in the nginx.conf. The only communication with the
tracker you're actually doing is asking which storage nodes have the
file that is being requested, so it makes no sense to have to define
this in two places. Especially since the tracker does healthchecking
(I believe) and keeps the nodelist up to date. That is something nginx
should not have to deal with.

> There are obviously tracker addresses in the configuration below.

Okay, sorry. I think when I gave it that quick glance I didn't notice it.

> I cannot create behind-the-scenes upstream structure, because at the moment it is unreasonable to not delegate the proxying logic to proxy module. On the other hand, the current nginx API does not allow to flexibly inherit proxy module's functionality.

I think for mod_mogilefs to be the best it can be, this needs to be
supported. Either that, or mod_mogilefs just does the same reproxying
that an upstream construct would do, just not natively using nginx's
upstream construct.

Essentially you would want to mimic perlbal's functionality, and then
you might win over dormando and he'll push your solution possibly ;)

Valery Kholodkov

Apr 23, 2009, 4:21:22 AM
to mog...@googlegroups.com
Michael Shadle wrote:
> On Wed, Apr 22, 2009 at 2:38 AM, Valery Kholodkov <val...@grid.net.ru> wrote:
>
>> Which upstreams do you mean here? Trackers or storage nodes?
>
> Trackers should be in nginx.conf
> Storage nodes are managed by the trackers, so those should not be
> hardcoded/repeated in the nginx.conf. The only communication with the
> tracker you're actually doing is asking which storage nodes have the
> file that is being requested, so it makes no sense to have to define
> this in two places. Especially since the tracker does healthchecking
> (I believe) and keeps the nodelist up to date. That is something nginx
> should not have to deal with.

That is exactly how things work in the mogilefs module.

>
>> There are obviously tracker addresses in the configuration below.
>
> Okay, sorry. I think when I gave it that quick glance I didn't notice it.
>
>> I cannot create behind-the-scenes upstream structure, because at the moment it is unreasonable to not delegate the proxying logic to proxy module. On the other hand, the current nginx API does not allow to flexibly inherit proxy module's functionality.
>
> I think for mod_mogilefs to be the best it can be, this needs to be
> supported. Either that, or mod_mogilefs just does the same reproxying
> that an upstream construct would do, just not natively using nginx's
> upstream construct.
>
> Essentially you would want to mimic perlbal's functionality, and then
> you might win over dormando and he'll push your solution possibly ;)

I'll take a look at what I can do about that.

--
Regards,
Valery Kholodkov

Michael Shadle

Apr 23, 2009, 11:54:18 AM
to mog...@googlegroups.com
On Thu, Apr 23, 2009 at 1:21 AM, Valery Kholodkov <val...@grid.net.ru> wrote:

>> Essentially you would want to mimic perlbal's functionality, and then
>> you might win over dormando and he'll push your solution possibly ;)
>
> I'll take a look what I can do about that.

Might be worth a shot to look at extending nginx's core functionality.
If you were able to implement all of perlbal's functionality, nginx
would become a much richer load balancer (and you could possibly
introduce dynamic upstreams, which would allow for less code in
mod_mogilefs too :))
