This is a very interesting question. I haven't tried to solve the problems you mention, but I don't think it's very difficult. (I'm familiar with the problems you describe since I used to work on HFT strategies for futures. I never traded wheat, but oil and natgas present similar problems.)
- For the instruments, I'd use the same names as the exchange, including contract expiration codes, just to keep things tidy. The same goes for spreads (more below).
- For margins, since those are held in separate accounts in cash, what you'd have to do is derive the payment requirements at EOD based on your positions. This would ideally be done by a script. Regardless, I would record the actual payments made in the cash accounts, and the script's computations would tell you the daily transfers required.
- In order to mark to market, you'd have to treat this like a cost basis adjustment, that is, make a transaction that empties out the account and replaces it with a new position at the adjusted cost basis (and a cash leg). It would not be super pretty since you'd have one of those every day, but I think it would work.
- For hedging with differing contract sizes, I'd attach the contract sizes as metadata on the commodity definitions and write a script to actually compute net exposures in terms of units (e.g., bushels). That's also how you'd deal with calendar spreads: you'd have a single instrument representing the long-short position, tracking your actual contract positions, and a script that computes your net/hedged exposure.
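To make the script idea in the bullets above a bit more concrete, here is a rough sketch in plain Python. It doesn't use Beancount's API at all; it assumes you have already extracted your positions and the contract-size metadata into dictionaries. The symbols, contract sizes, and the function names `net_exposure` and `variation_margin` are all made up for illustration, so substitute whatever your exchange and ledger actually use.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical contract metadata, as it might be read from commodity definitions.
# Symbols and sizes are illustrative assumptions, not exchange data.
CONTRACT_SPECS = {
    "ZWH25": {"underlying": "WHEAT", "units": Decimal("5000")},  # assumed full-size, bushels
    "ZWK25": {"underlying": "WHEAT", "units": Decimal("5000")},  # assumed later expiry
    "XWH25": {"underlying": "WHEAT", "units": Decimal("1000")},  # assumed mini contract
}


def net_exposure(positions):
    """Sum signed contract counts into units of the underlying (e.g. bushels).

    `positions` maps contract symbol -> signed number of contracts
    (positive = long, negative = short).
    """
    exposure = defaultdict(Decimal)
    for symbol, contracts in positions.items():
        spec = CONTRACT_SPECS[symbol]
        exposure[spec["underlying"]] += contracts * spec["units"]
    return dict(exposure)


def variation_margin(positions, prev_settle, settle):
    """Cash to be received (positive) or paid (negative) for one day's mark to market.

    `prev_settle` and `settle` map contract symbol -> settlement price per unit.
    """
    total = Decimal("0")
    for symbol, contracts in positions.items():
        units = CONTRACT_SPECS[symbol]["units"]
        total += contracts * units * (settle[symbol] - prev_settle[symbol])
    return total
```

For instance, a calendar spread that is long one `ZWH25` and short one `ZWK25` nets out to zero bushels of exposure under these assumed sizes, while the variation margin still depends on how the two settlement prices moved relative to each other. The amount returned by `variation_margin` is what you would compare against the actual transfers recorded in the cash accounts.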
All in all, I don't think any of this is very challenging conceptually though it definitely requires some custom coding to produce the kinds of reports you would need.
Finally, if this is for infrequent investing or hedging of small numbers of contracts in a personal account, it would make sense to include these in one's usual personal balance sheet, so I can see how using Beancount would be beneficial in that case. But if you're doing frequent trading (e.g. pairs trading with dozens or hundreds of trades/day), I think you'd probably be better off writing something more custom and less general than Beancount, something just for your trading, based off of a table. All the reports you'd really use would be your custom ones anyhow. In that latter case I'm not sure how much benefit you'd get from the features Beancount provides, and it might just be too slow for what you need.
In any case, let us know how it goes,