If I were to design such a protocol I would like it to have a robust way
for peers to negotiate things like what extensions they support, what
piece size they use and so on. Besides that, it would need a simple way
for them to exchange pieces and to tell other peers what they have and
whether they are interested in downloading and/or uploading (pretty much
like BT, but perhaps a bit more flexible). BitTorrent does most things
the "right" way when it comes to this kind of downloading, but what I
would do differently is to add more flexibility. One interesting idea would be to
allow different piece sizes between different peers or perhaps to not
use "pieces" in the p2p protocol at all, but rather something like
arbitrary byte ranges.
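The negotiation idea above could look roughly like this. It is only a sketch: the `Hello` message, its fields and the `negotiate` function are invented here for illustration and do not come from BitTorrent or any existing protocol.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Hello:
    """Hypothetical handshake message; not part of any existing protocol."""
    extensions: FrozenSet[str]
    piece_sizes: FrozenSet[int]  # piece sizes in bytes this peer accepts


def negotiate(a: Hello, b: Hello) -> Tuple[FrozenSet[str], Optional[int]]:
    """Return the shared extensions and the largest piece size both accept.

    A piece size of None means the peers have no size in common and would
    have to fall back to something else, e.g. arbitrary byte ranges.
    """
    shared_ext = a.extensions & b.extensions
    shared_sizes = a.piece_sizes & b.piece_sizes
    return shared_ext, (max(shared_sizes) if shared_sizes else None)


alice = Hello(frozenset({"ranges", "pex"}), frozenset({262144, 524288}))
bob = Hello(frozenset({"ranges"}), frozenset({262144}))
shared, size = negotiate(alice, bob)
```

Each side only ever uses what both sides announced, so an old client and a new client can still talk to each other.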
A simple protocol with greater flexibility and perhaps some new
interesting ways to find other peers could be a useful alternative to
BitTorrent (and traditional downloading) in my opinion. Using
metalinks instead of a new (unknown) file format would also be a great
idea!
Hope you found some of my thoughts and ideas to be interesting. It's a
very interesting subject and I really hope that someone creates
something like that one day (and makes the protocol open and flexible).
/ Hampus
from your response it seems we are on the same track.
what I envision is a system very similar to BitTorrent's client-server
model, but utilizing .metalink files.
clients will handle client to client communication based on the
protocol being used (p2p or other)
however, client to server communication could be similar to
BitTorrent.
since BitTorrent already exists & this type of system has proven
itself as viable, I think it would be wise to follow its framework in
some aspects.
why would you not suggest using HTTP for client to client & client to
server communication?
> clients will handle client to client communication based on the
> protocol being used (p2p or other)
CLARIFICATION:
==============
in the event of a non-p2p protocol, for example HTTP or FTP, why would
you not suggest using HTTP for client-to-client communication?
(chunks, etc.)
as you mentioned a simple protocol is all that would be needed in
these cases?
what are the pros & cons of using HTTP vs a custom protocol in this
context?
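One point in favour of HTTP here: it already has a standard way to ask for a byte range (the Range header), so a chunk request needs no new wire format. A minimal sketch, assuming a made-up helper name `range_request` and a placeholder URL; the request is only built, never sent:

```python
from urllib.request import Request


def range_request(url: str, chunk_index: int, chunk_size: int, file_size: int) -> Request:
    """Build (but do not send) an HTTP request for one chunk of a file."""
    start = chunk_index * chunk_size
    if start >= file_size:
        raise ValueError("chunk index is past the end of the file")
    end = min(start + chunk_size, file_size) - 1  # the Range end is inclusive
    return Request(url, headers={"Range": f"bytes={start}-{end}"})


# Third chunk (index 2) of a 600000-byte file split into 256 KiB chunks.
req = range_request("http://example.com/file.iso", 2, 262144, 600000)
```

A server that honours the header answers with 206 Partial Content; one that ignores it sends the whole file with 200, so a client would have to check the status before trusting the data as a chunk.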
The Japanese author of aria2 will implement this possibility in the next
release, 0.12.X.
I spoke with antbryan two weeks ago about this challenge, and the only
trouble for me is the size of all the small hash codes... a temporary
solution, when one P2P protocol such as BitTorrent is used (and no other
P2P method, to avoid conflicts), is to use the default BitTorrent hash
size method... this proposal was recently added to the specs in the
"bittorrent column", because this P2P protocol is well known and widely
used in metalinks...
"If BitTorrent is used with Metalink, it is beneficial to use the same
chunk size and hash with both."
now this is the BT specs:
*****************************************************************************************
piece length maps to the number of bytes in each piece the file is
split into. For the purposes of transfer, files are split into
fixed-size pieces which are all the same length except for possibly
the last one which may be truncated. piece length is almost always a
power of two, most commonly 2^18 = 256 K (BitTorrent prior to
version 3.2 uses 2^20 = 1 M as default).
pieces maps to a string whose length is a multiple of 20. It is to
be subdivided into strings of length 20, each of which is the SHA1
hash of the piece at the corresponding index.
There is also a key length or a key files, but not both or neither.
If length is present then the download represents a single file,
otherwise it represents a set of files which go in a directory
structure.
In the single file case, length maps to the length of the file in
bytes.
***************************************************************************************
now another info from http://wiki.theory.org/BitTorrentSpecification
* The piece length specifies the nominal piece size, and is usually
a power of 2. The piece size is typically chosen based on the total
amount of file data in the torrent, constrained by the fact that
piece sizes too large cause inefficiency, and too small a piece size
will result in a large .torrent metadata file. The conventional
wisdom used to be to pick the smallest piece size that results in a
.torrent file no greater than approx. 50 - 75 kB (presumably to ease
the load on the server hosting the torrent files). However, now that
hosting storage and bandwidth are not tightly constrained, it is
best to keep the piece size to 512KB or less, at least for torrents
under 8-10GB or so, even if that results in a larger torrent file,
in order to have a more efficient swarm for sharing files. The most
common sizes are 256 kB, 512 kB, and 1 MB. Every piece is of equal
length except for the final piece, which is irregular. The number of
pieces is thus determined by 'ceil( total length / piece size )'.
For the purposes of piece boundaries in the multi-file case,
consider the file data as one long continuous stream, composed of
the concatenation of each file in the order listed in the files
list. The number of pieces and their boundaries are then determined
in the same manner as the case of a single file. Pieces may overlap
file boundaries.
* Each piece has a corresponding SHA1 hash of the data contained
within that piece. These hashes are concatenated to form the pieces
value in the above info dictionary. Note that this is not a list but
rather a single string. The length of the string must be a multiple
of 20.
***********************************************************************************
(sorry, my English level is not good enough to understand all of these 2
texts :)
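The two quoted texts boil down to this: cut the file into fixed-size pieces, take the SHA1 of each piece, and glue the 20-byte results together. A small sketch using only Python's standard library; the function name `piece_hashes` and the tiny piece length are just for illustration:

```python
import hashlib
import math


def piece_hashes(data: bytes, piece_length: int) -> bytes:
    """Concatenate the 20-byte SHA-1 digest of every fixed-size piece."""
    return b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )


data = b"x" * 1000                        # stand-in for the file contents
pieces = piece_hashes(data, 256)          # real torrents use e.g. 262144
num_pieces = math.ceil(len(data) / 256)   # ceil(total length / piece size)
```

Here the last piece is only 232 bytes long (the "truncated" one), and the length of `pieces` is 20 times the number of pieces.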
> Hampus wrote:
> For the swarming to work effectively I think the clients would need to
> be able to constantly update the other clients about what parts of the
> file they are offering (like bt, or in a new and smarter way).
yes, this is correct. in the earlier brainstorming sessions I was
actually thinking of a set of Metalink extensions for the BitTorrent
protocol, but this would require the client to support BitTorrent as
well as the Metalink extensions.
so, basically we have a tradeoff here. build upon an existing
protocol or create a new one.
I believe the shortest path to deployment would be to extend BT.
is this a viable solution or is the custom route better?
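For reference, the way BT peers tell each other which parts they have is a bitfield: one bit per piece, with the high bit of the first byte standing for piece 0 and unused trailing bits left at zero. A sketch of that encoding (the function names are my own, not from any client):

```python
from typing import Iterable, List


def encode_bitfield(have: Iterable[int], num_pieces: int) -> bytes:
    """Pack zero-based piece indexes into a BitTorrent-style bitfield."""
    bits = bytearray((num_pieces + 7) // 8)
    for index in have:
        if not 0 <= index < num_pieces:
            raise ValueError(f"piece index {index} out of range")
        bits[index // 8] |= 0x80 >> (index % 8)  # high bit = lowest index
    return bytes(bits)


def decode_bitfield(bitfield: bytes, num_pieces: int) -> List[int]:
    """Return the zero-based indexes of the pieces marked as available."""
    return [i for i in range(num_pieces) if bitfield[i // 8] & (0x80 >> (i % 8))]


field = encode_bitfield([0, 1, 2, 5, 7], 8)
```

A Metalink extension or a custom protocol could reuse the same idea, since it only depends on both sides agreeing on the piece size.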
> Hampus wrote:
> FTP is not very good at requesting small byte-ranges (you must
> disconnect when you don't want any more data), so that's out of the
> question in my opinion.
I have to admit to not being an expert on FTP, but I believe several
download managers have the ability to download the same data source
simultaneously from multiple FTP servers. I am pretty sure ReGet does
this. I also think it swarms HTTP & FTP streams concurrently.
however, I have never seen a client that combines FTP & p2p swarming.
there is a FTP client called Fastream NetFile, which was previously
known as FTP++ P2P. I have never used it, but it claimed to enable
p2p & FTP swarming in the past. I don't see this wording on their
site anymore.
I'm not sure if we understood each other in the previous messages, but
I was referring to using the custom p2p protocol in question to enable
p2p / FTP swarming between clients for large FTP transfers (*nix ISOs,
etc).
As for your first question: what would you use to do P2P and FTP at
the same time.
Use Phex, it has both FTP and P2P capabilities. With just a few edits
and a high value for the preference attribute in the metalink magnet
link, you would have this working. So, as far as I can see, this is
(almost) already being done.
One of the benefits of metalinks is that a tracker is not needed and
you don't need a very special client to get going. You could use any
P2P network, as far as metalinks are concerned. Some of the metalinks
on metamirrors.nl also allow you to use anonymous P2P systems, like
GNUnet.
As for your last question: "what would you do better?"
I would allow the client to solve more of its problems, by giving the
client more information which will allow the client to solve its own
problems. This is just what Metalinks are doing.
If your provider blocks or throttles BitTorrent, use any of the other
protocols. If you only have HTTP support, use that! If you only have
SMTP support, try GNUnet through SMTP. If you have a weird admin ffing
with your data, or corruption on the line, make sure the client knows
and is able to fix it. If you can't reach outside of your country, use
any of the links from inside it.
The whole idea behind metalinks is that "another protocol" is not the
solution, every protocol will have some kind of disadvantage in some
kind of situation: give the client the freedom, and knowledge, to fix
any of the problems it may encounter.
Bram
Commercial: Join the metalinks forum at http://metamirrors.nl/forum
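The "give the client the freedom" idea can be pictured as a simple filter over the sources listed in a metalink. This is only a sketch: `Source` and `usable_sources` are hypothetical names, the URLs are placeholders, and the `preference` field is meant to mirror the preference attribute mentioned above (higher is tried first).

```python
from typing import Iterable, List, NamedTuple


class Source(NamedTuple):
    url: str
    protocol: str     # e.g. "http", "ftp", "bittorrent"
    preference: int   # higher is tried first


def usable_sources(sources: Iterable[Source], blocked: Iterable[str]) -> List[Source]:
    """Drop sources whose protocol is blocked here, best preference first."""
    blocked_set = set(blocked)
    candidates = [s for s in sources if s.protocol not in blocked_set]
    return sorted(candidates, key=lambda s: s.preference, reverse=True)


sources = [
    Source("http://example.com/a.iso", "http", 50),
    Source("http://example.com/a.iso.torrent", "bittorrent", 100),
    Source("ftp://example.org/a.iso", "ftp", 80),
]
# A client behind a provider that throttles BitTorrent:
fallback = usable_sources(sources, blocked=["bittorrent"])
```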
ok so Phex is multiprotocol like GetRight ;)
thanks for the replies Hampus & Sebastien.
It's not a complete download manager, but I might make one, based on the lib, in the future.
AFAIK there are 3 programs which use this method: GetRight (done), aria2
(in the next core release) and you...
what exactly is the method used by BitTorrent for file hashes? the specs
are too complicated for me... any examples ?
> whole file. The pieces are often 256 kB for small files and can be quite
> large for very large files. Small pieces lets the client find bad data
> sooner, but makes the torrent-file larger.
>
> Metalinks with "chunk checksums" (called "piece checksums" in BT
> terminology) work exactly the same way, but the format is different of
> course. Both use SHA-1 for the hashes.
what do you mean by format, the content of the torrent or of the metalink
itself ?
for the moment, as described in the newest Metalink 3.0 specs, we can use
the BT filehash if a .torrent is included, and use your own filehash,
which will be detailed (very very very) soon, if a torrent link is not
included in the metalink...
IMHO, the number of filehashes for a file should not be more than 50...
this number is an idea based on the famous BitTorrent proverb "Small
pieces lets the client find bad data sooner, but makes the torrent-file
larger." ;)
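To put numbers on that proverb: each SHA-1 digest is 20 bytes, so the hash data grows with the number of pieces. A rough sketch (function names invented here) that computes the raw digest size for a given piece length, and the smallest power-of-two piece length that stays under a piece limit such as the 50 suggested above:

```python
SHA1_DIGEST_BYTES = 20


def num_pieces(file_size: int, piece_length: int) -> int:
    """ceil(file_size / piece_length) using integer arithmetic."""
    return (file_size + piece_length - 1) // piece_length


def hash_overhead(file_size: int, piece_length: int) -> int:
    """Bytes of raw SHA-1 digests needed to cover the whole file."""
    return num_pieces(file_size, piece_length) * SHA1_DIGEST_BYTES


def piece_length_for(file_size: int, max_pieces: int) -> int:
    """Smallest power-of-two piece length giving at most max_pieces pieces."""
    length = 1
    while num_pieces(file_size, length) > max_pieces:
        length *= 2
    return length


iso_size = 700 * 1024 * 1024                          # a 700 MiB ISO
overhead_256k = hash_overhead(iso_size, 256 * 1024)   # 2800 pieces
length_for_50 = piece_length_for(iso_size, 50)
```

For the 700 MiB example, 256 KiB pieces cost 56000 bytes of raw digests, while a limit of 50 pieces forces 16 MiB pieces, so a single bad byte means re-downloading up to 16 MiB. In a metalink the hashes are written as hex text inside XML, so the real overhead is at least twice the raw figure.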
I don't get it; what's wrong with BitTorrent? Why do you want to
reinvent the wheel? The protocol is already tested and good. It
supports extensions and exchanging supported extensions during peer
connection. The download requests between clients actually ask for a
byte range, usually for a small piece (32KB), so clients can use
different piece sizes*. These are sub-pieces of the .torrent-defined
"chunk". Documentation: http://tinyurl.com/32k7wu
You can find peers in whatever way you want to without disrupting the
rest of the protocol. A lot of clients support DHT, which is an
official extension. There is also peer exchange, where peer A can tell
peer B about peers G, X, and J that he knows about and B might not
know. (Some people claim PEX can actually be harmful for the swarm).
Azureus supports manually typing the IP and port of a known peer, for
example (via a plugin, Peer Injector), and finding peers on your LAN;
these don't involve *any* extension or change on the BitTorrent
protocol itself. Any Azureus plugin may add peers too.
* (although some clients really don't like it when other clients ask
for big pieces, and drop the connection immediately)
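To illustrate the byte-range point: a BitTorrent "request" message carries a piece index, an offset inside that piece and a length, each as a 4-byte big-endian integer, after a length prefix of 13 and the message id 6. A sketch using Python's `struct`; the helper names are mine, and the 32 KB block size simply follows the figure above (clients may use a different size):

```python
import struct
from typing import Iterator

REQUEST_ID = 6  # message id of "request" in the BitTorrent peer protocol


def request_message(index: int, begin: int, length: int) -> bytes:
    """Length-prefixed request: <len=13><id=6><index><begin><length>."""
    return struct.pack(">IBIII", 13, REQUEST_ID, index, begin, length)


def piece_requests(index: int, piece_length: int,
                   block_size: int = 32 * 1024) -> Iterator[bytes]:
    """Yield the request messages needed to fetch one whole piece."""
    for begin in range(0, piece_length, block_size):
        yield request_message(index, begin, min(block_size, piece_length - begin))


msg = request_message(3, 32768, 32768)
all_requests = list(piece_requests(0, 262144))
```

Since the offset and length are explicit, the block size is chosen by the requesting peer, not fixed by the .torrent file.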
On Sat, 22 Sep 2007 21:39:58 +0200, "Hampus Wessman" <h...@vox.nu> said:
> Sebastien WILLEMIJNS wrote:
> > whole file. The pieces are often 256 kB for small files and can be quite
> > large for very large files. Small pieces lets the client find bad data
> > sooner, but makes the torrent-file larger.
> >
> > Metalinks with "chunk checksums" (called "piece checksums" in BT
> > terminology) work exactly the same way, but the format is different of
> > course. Both use SHA-1 for the hashes.
> what do you mean by format, the content of torrent or metalink itself ?
> I don't get it; what's wrong with BitTorrent? Why do you want to
> reinvent the wheel?
the trouble for me is that not all P2P protocols use the same hashing
methods...
> I don't get it; what's wrong with BitTorrent? Why do you want to
> reinvent the wheel?
seems you have misread something here, Nicolas.
we're not discussing re-inventing the wheel.
the issue is how to implement swarming between non-p2p protocols &
what is the best way to do this?
the great thing about Metalink is that virtually any protocol can be
defined within a Metalink file.
it is the client's responsibility to handle the protocols.
however, if a developer wishes to implement swarming between non-p2p
protocols we should devise a standard solution that everyone can use.
one way is to have the clients report to a tracker-style server using
HTTP. (similar to BT)
to do this the most effective way, the clients wishing to swarm non-
p2p protocols should be able to talk w/ each other in some way also.
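a very rough sketch of the tracker-style part, just to make the idea concrete. everything here is hypothetical: the class name `SwarmRegistry`, the `announce` call and the idea of keying the swarm by a file hash are my assumptions, and a real server would sit behind HTTP instead of living in memory.

```python
from collections import defaultdict
from typing import Dict, List, Set


class SwarmRegistry:
    """Hypothetical tracker-like registry, kept in memory for illustration."""

    def __init__(self) -> None:
        self._peers: Dict[str, Set[str]] = defaultdict(set)

    def announce(self, file_hash: str, peer_addr: str) -> List[str]:
        """Record peer_addr for file_hash and return the other known peers."""
        others = sorted(self._peers[file_hash] - {peer_addr})
        self._peers[file_hash].add(peer_addr)
        return others


registry = SwarmRegistry()
first = registry.announce("abc", "10.0.0.1:6881")    # nobody else yet
second = registry.announce("abc", "10.0.0.2:6881")   # learns about the first
```

once a client has a list of other clients for the same file, the client-to-client part can use whatever protocol the two sides agree on.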
I plan on starting a wiki & creating some diagrams soon of my ideas
for such a system. I hope that everyone here will contribute their
ideas & provide feedback on this.
what you have mentioned directly above this post is the basic premise
of what I have been trying to describe.
also, as you mentioned towards the end of your message, the biggest
challenge here is deciding on a protocol to use...
I have attached a basic mock-up of the idea in my head.
as you can see from the diagrams the system resembles BT's client /
server system w/ the major difference being that it supports multiple
protocols due to the usage of Metalink files.
Figure 1a displays the initial connections of the clients:
http://img503.imageshack.us/img503/835/metamirrorserverfig1aeb4.gif
Figure 1b displays the Metalink / MetaMirror swarm full circle:
http://img207.imageshack.us/img207/7136/metamirrorserverfig1bgw5.gif
the swarm comes full circle because the clients share source / peer
info w/ each other as well as the MetaMirror server.
if anyone needs an explanation of the diagrams please let me know.
it will be done... only if you give a lot of examples ;)
some ideas:
A temporary dynamic database (knoppix-5-1.1.iso.metalink.temp) which
contains some lines (pieces 1, 2, 3, 6 and 8 are complete; pieces 4, 5
and 7 aren't complete)
+ easy to implement / may be included in the metalink file itself with a
dynamic flag
<hash piece="3">dd997bac444b1d1cfb50ef35bacbf2c643b1e3de<received>notyet</received></hash>
<hash piece="4">dd123456788997845340ef35bacbf2c643b1e3de<received>yes</received></hash>
- if a protocol makes a mistake when writing, the database content is
wrong...
Our own filehashing:
+ partial files can be easily transferred between protocols because
filesize/filehash is common
- must be a universal method for all P2P protocols
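a small sketch of the temporary database idea, using Python's standard XML module. the `<received>` element with "yes" / "notyet" is the proposal from this post, not part of the Metalink spec, and the function names and the shortened hash strings are made up for the example.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple


def dump_state(pieces: Dict[int, Tuple[str, bool]]) -> str:
    """Serialise {piece number: (sha1 hex, received?)} to XML text."""
    root = ET.Element("pieces")
    for number in sorted(pieces):
        sha1_hex, received = pieces[number]
        node = ET.SubElement(root, "hash", piece=str(number))
        node.text = sha1_hex
        ET.SubElement(node, "received").text = "yes" if received else "notyet"
    return ET.tostring(root, encoding="unicode")


def missing_pieces(xml_text: str) -> List[int]:
    """Return the piece numbers that are not marked as received."""
    root = ET.fromstring(xml_text)
    return [int(h.get("piece")) for h in root.iter("hash")
            if h.findtext("received") != "yes"]


state = {3: ("dd99...e3de", False), 4: ("dd12...e3de", True)}
xml_text = dump_state(state)
todo = missing_pieces(xml_text)
```

any protocol handler that finishes a piece would rewrite the file, and any other handler can read it to see what is still missing. this also shows the weak point listed above: whoever writes a wrong "yes" corrupts the state for everyone.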
i prefer examples of technical text ;)
You may want to use this:
http://groups.google.com/group/metalink-discussion/web
this is not meant to replace this group only supplement it.
once the ideas are hashed out I will also setup a wiki for the
technical details.
it appears Dijjer requires a client to access its CDN. I'm not sure
how many devs would implement it in their clients...
however, a CDN like Coral[1] can be accessed by any client without
special software.
> doesn't this actually equate to a single point of failure?
> what happens when http://dijjer.org/get/ is offline?
> Dijjer & other CDN URLs could be included in .metalink files which
> makes SPOF a non-issue.
> it appears Dijjer requires a client to access its CDN. I'm not sure
> how many devs would implement it in their clients...
about Dijjer: the 4 download examples gave me no... files, grin !
(404s and the software crashed...) the project seems dead
doesn't this actually equate to a single point of failure?
what happens when http://dijjer.org/get/ is offline?
Dijjer & other CDN URLs could be included in .metalink files which
makes SPOF a non-issue.
it appears Dijjer requires a client to access its CDN. I'm not sure
how many devs would implement it in their clients...
however, a CDN like Coral[1] can be accessed by any client without
special software.
[1] http://www.coralcdn.org/
On Sep 23, 9:53 pm, "Aren Olson" <reacoc...@gmail.com> wrote:
> I think something like this ad-hoc swarming has already been done, it's
> called dijjer: http://dijjer.org/
> Perhaps we could just tie into their network?
>
> Aren Olson
>
www.ubuntu.com
what exactly does this software do ? a web interface client ? and ?
ok, thanks for your answer. I think this P2P protocol with an HTTP/FTP
downloader will give us no new information for the metalink filehash...