Num weight bits = 18
learning rate = 0.5
initial_t = 0
power_t = 0.5
using no cache
Reading datafile = sampledata.vw
num sources = 1
Enabled reductions: gd, scorer, csoaa_ldf, cb_adf, shared_feature_merger, cb_to_cbadf
average since example example current current current
loss last counter weight label predict features
[critical] vw (cb_adf.cc:279): cb_adf: badly formatted example, only one cost can be known.
Why are these reductions enabled: csoaa_ldf, cb_adf, shared_feature_merger, cb_to_cbadf?
vw --cb_explore 2 --cover 10 -d train.txt -l 0.1 --power_t 0 --save_resume vw.model --l1 1e-08
Since --power_t is set to 0 and the learning rate doesn't decay, does having --save_resume make any difference to the model performance?
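On the --power_t part: here is an illustrative sketch (my own toy, not VW source) of the per-example learning-rate schedule, roughly eta_t = eta * (initial_t / (initial_t + t)) ** power_t. With --power_t 0 the exponent removes the decay and the rate stays constant. Note that --save_resume also stores other training state (e.g. adaptive accumulators), so it can still matter for resuming training beyond the rate schedule.

```python
# Illustrative sketch (not VW internals) of a power_t-style rate schedule.
# initial_t defaults to 1.0 here to avoid division by zero; VW's default is 0.
def eta_t(eta: float, t: float, initial_t: float = 1.0, power_t: float = 0.5) -> float:
    return eta * (initial_t / (initial_t + t)) ** power_t

print(eta_t(0.1, 1000, power_t=0.0))                    # 0.1 — constant when power_t is 0
print(round(eta_t(0.1, 3, initial_t=1.0, power_t=0.5), 4))  # 0.05 — decays when power_t > 0
```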
I am trying to load a model from a file:
modelfromfile = pyvw.vw(quiet=True).load('some.model')
But I am getting the following error:
AttributeError: 'vw' object has no attribute 'load'
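For what it's worth, that AttributeError is expected: as far as I know, pyvw.vw has no .load() method; a saved model is loaded by passing the -i (initial regressor) option at construction time. The helper below only builds the argument string (model_load_args is my own name, not part of pyvw):

```python
# Hedged sketch: pyvw loads a saved model via the -i option at construction,
# not via a .load() method.
def model_load_args(model_path: str, quiet: bool = True) -> str:
    """Build the argument string that loads a saved model into pyvw.vw."""
    flags = "--quiet " if quiet else ""
    return f"{flags}-i {model_path}"

# usage (requires the vowpalwabbit package and the model file on disk):
#   from vowpalwabbit import pyvw
#   model = pyvw.vw(model_load_args("some.model"))
print(model_load_args("some.model"))  # --quiet -i some.model
```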
Hey guys, I was wondering... what influences the time to load a model? I've tested model files of different sizes, but it doesn't seem like there's any correlation.
Is it the total number of weights?
The number of non-null weights? (I don't think so, because that's correlated with file size, which isn't correlated with loading time.) And by the way, is there an easy way to count the number of non-null weights? I'm currently iterating over all weights and counting those that aren't 0.
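For the counting part, a generic sketch: if you can copy the weights into numpy, count_nonzero avoids the Python loop. Assuming pyvw's num_weights()/get_weight(i) accessors; the toy array below stands in for the real model:

```python
import numpy as np

# Stand-in for: weights = np.array([vw.get_weight(i) for i in range(vw.num_weights())])
weights = np.array([0.0, 0.3, 0.0, -1.2, 0.0])

# Count weights that are exactly non-zero in one vectorized call.
nonzero = int(np.count_nonzero(weights))
print(nonzero)  # 2
```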
Good morning everyone... I have a simple question regarding one of the Personalizer's examples.
Why does the JSON input in this example define the features as an array of objects? Why not simply an object? Does it have anything to do with namespaces?
Hi all, I was wondering how lrq (aka factorization machine) plays with Importance Weight Aware Updates (https://arxiv.org/pdf/1011.1576.pdf) since the development in that paper is for linear models:
In this paper we focus on linear models i.e. p = <w, x> where w is a vector of weights
but lrq models are not linear given that they involve products of weights. So what about the property:
Therefore all gradients of a given example point to the same direction and only differ in magnitude.
that's assumed in the paper? I wasn't able to find any related discussion.
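A toy numerical illustration (my own construction, not VW's actual update) of why that property fails for lrq-style models: with p = (l*r)*x1*x2 under squared loss, the gradient with respect to (l, r) depends on the current weights, so its direction changes along the update path, whereas for a linear model p = <w, x> every gradient is parallel to x:

```python
import numpy as np

# Rank-1 interaction term only, squared loss: L = 0.5 * (l*r*x1*x2 - y)^2
def grad(l, r, x1, x2, y):
    p = l * r * x1 * x2
    e = p - y
    return np.array([e * r * x1 * x2,   # dL/dl depends on current r
                     e * l * x1 * x2])  # dL/dr depends on current l

x1, x2, y = 1.0, 1.0, 1.0
g0 = grad(l=0.5, r=1.0, x1=x1, x2=x2, y=y)

# Take one small gradient step, then recompute the gradient.
l1, r1 = 0.5 - 0.1 * g0[0], 1.0 - 0.1 * g0[1]
g1 = grad(l1, r1, x1, x2, y)

# Compare directions (unit vectors): they differ, unlike the linear case.
d0 = g0 / np.linalg.norm(g0)
d1 = g1 / np.linalg.norm(g1)
print(np.allclose(d0, d1))  # False — the gradient direction moved
```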
vw --active --port 6075 --daemon --foreground -i existing_model_filename.ext
Then I run python3.9 active_iterator.py localhost 6075 unlabeled_examples_filename.ext. All of the over 800K unlabeled examples return an importance of exactly 1.0, even though the predictions are variable and largely accurate. In the past I have received highly useful and variable importance weights, and I cannot figure out what is wrong now. The only possibility that even occurs to me is that in active_iterator.py I had to change the first line of the recvall function from buf=s.recv(n) to buf=s.recv(n).decode(), and I changed the sendall calls from sock.sendall(line) to sock.sendall(line.encode()). Any ideas? Thanks very much.
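On the bytes/str change: a common pitfall is decoding per chunk inside recvall, which can split multi-byte characters and makes the byte count disagree with the character count. A hedged sketch of handling the boundary once, with socketpair standing in for the daemon connection:

```python
import socket

def recvall(s, n):
    """Accumulate exactly n bytes; decode only after the loop, not per chunk."""
    buf = b""
    while len(buf) < n:
        chunk = s.recv(n - len(buf))
        if not chunk:
            break
        buf += chunk
    return buf

a, b = socket.socketpair()
line = "1 | example features\n"
a.sendall(line.encode())                   # encode once at the send boundary
payload = line.encode()
received = recvall(b, len(payload)).decode()  # decode once after all bytes arrive
print(received == line)  # True
a.close(); b.close()
```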
VWMultiClassifier.fit only supports a 1-d y
Hello everyone. I have problems with convergence for --cb_explore_adf on a simple synthetic dataset of 3 classes and 5 features, all informative (the data was simulated via scikit-learn's make_classification).
Here's a gist and an output plot above. Basically, it doesn't learn the difference between classes and performs no better than baseline (predicting one class each time). I tried softmax exploration with different lambdas (since it's not bound by an epsilon-soft convergence bound like the e-greedy policy); same results.
FYI, sklearn's softmax multiclass classifier gets 80% mean accuracy. So I really don't understand what might be the problem with VW (maybe a smaller learning rate is needed?). The documentation and the lack of --audit support in Python certainly don't help.
https://gist.github.com/ColdTeapot273K/76a91a0416cb9b8ecb114e625f88f4a0
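In case the input format is a factor here, a sketch of the cb_adf text format as I understand it: one "shared" line for context features, then one line per action, where the chosen action's line carries an "index:cost:probability" label. The formatter name and feature layout are my own:

```python
def to_cb_adf(shared, actions, chosen, cost, prob):
    """shared: list of (name, value) pairs; actions: list of feature strings."""
    shared_feats = " ".join(f"{n}:{v}" for n, v in shared)
    lines = [f"shared | {shared_feats}"]
    for i, a in enumerate(actions):
        # Only the chosen action gets a label; the rest are unlabeled lines.
        label = f"{i}:{cost}:{prob} " if i == chosen else ""
        lines.append(f"{label}| {a}")
    return "\n".join(lines)

ex = to_cb_adf([("f1", 0.5), ("f2", -1.2)], ["a:1", "a:2", "a:3"],
               chosen=1, cost=0.0, prob=0.33)
print(ex)
```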
Take a look at the issues with the "Good first issue" label and see if anything interests you.