pip install ydf datasets tiktoken -U
from itertools import islice
from datasets import load_dataset
import tiktoken
import ydf
About this tutorial¶
This tutorial shows how to train a text classifier on the AG News dataset using categorical-set features, with either a whitespace tokenizer or a GPT tokenizer.
What are text & categorical-set features?¶
A categorical-set feature is a type of input feature where each value is a set (i.e., an unordered collection) of categorical values. It differs from a classical categorical feature, where each value is a single categorical value.
Examples of categorical value | Examples of categorical-set value |
---|---|
RED | {RED} |
BLUE | {RED, BLUE} |
BLUE | {} |
missing | missing |
Why use categorical-set features?¶
Categorical-set features are useful for tag-like or tokenizable features such as text and URLs, especially with small datasets. For example, with whitespace-splitting unigram tokenization, the text feature value "I eat an applepie!" becomes the categorical-set feature value {"I", "eat", "an", "applepie!"}. Bigram or trigram tokenization can also be used and sometimes gives good results. For example, the same text becomes {"I_eat", "eat_an", "an_applepie!"} with bigrams.
While simple, whitespace splitting does not work well with punctuation or with some languages. If possible, prefer modern tokenizers such as Google's SentencePiece, OpenAI's tiktoken, or Transformer tokenizers.
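The unigram and bigram examples above can be reproduced with a few lines of plain Python. This is only an illustrative sketch: the helper name `whitespace_ngrams` and the "_" separator are assumptions made for this example and are not part of YDF.

```python
def whitespace_ngrams(text, n=1, sep="_"):
    """Split on whitespace, then join every run of n consecutive words."""
    words = text.split()
    return [sep.join(words[i:i + n]) for i in range(len(words) - n + 1)]


unigrams = whitespace_ngrams("I eat an applepie!", n=1)
bigrams = whitespace_ngrams("I eat an applepie!", n=2)

print(unigrams)  # ['I', 'eat', 'an', 'applepie!']
print(bigrams)   # ['I_eat', 'eat_an', 'an_applepie!']
```

A text with fewer than n words yields an empty list, which corresponds to the empty categorical-set value {}.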
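To make the punctuation issue concrete, the sketch below compares whitespace splitting with a small regular-expression tokenizer that separates punctuation from words. The function `tokenize_words_and_punctuation` is a hypothetical helper written for this illustration; it is not used elsewhere in this tutorial and it is much simpler than a real subword tokenizer.

```python
import re


def tokenize_words_and_punctuation(text):
    # \w+ matches a run of letters, digits or underscores.
    # [^\w\s] matches a single character that is neither a word
    # character nor whitespace, i.e. one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)


whitespace_tokens = "I eat an applepie!".split(" ")
regex_tokens = tokenize_words_and_punctuation("I eat an applepie!")

print(whitespace_tokens)  # ['I', 'eat', 'an', 'applepie!']
print(regex_tokens)       # ['I', 'eat', 'an', 'applepie', '!']
```

With whitespace splitting, "applepie!" and "applepie" are two unrelated tokens; separating the punctuation lets both texts share the token "applepie".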
In this tutorial, we will use tiktoken with the "r50k_base" configuration used by GPT-2 and some GPT-3 models. For example, the text above will become: ["I", "_eat", "_an", "_apple", "pie", "!"], where "_" stands for a leading space.
Loading the dataset¶
We are working with the AG News dataset. The goal of this dataset is to predict the type of an article from its text. It is a classical text classification dataset.
Let's load the dataset.
def ag_news_dataset(split: str):
    class_mapping = {
        0: "World",
        1: "Sports",
        2: "Business",
        3: "Sci/Tech",
    }
    for example in load_dataset("ag_news")[split]:
        yield {
            "text": example["text"],
            "label": class_mapping[example["label"]],
        }

# Print the first 3 training examples
for example_idx, example in enumerate(islice(ag_news_dataset("train"), 3)):
    print(f"==========\nExample #{example_idx}\n----------")
    print(example)
==========
Example #0
----------
{'text': "Wall St. Bears Claw Back Into the Black (Reuters) Reuters - Short-sellers, Wall Street's dwindling\\band of ultra-cynics, are seeing green again.", 'label': 'Business'}
==========
Example #1
----------
{'text': 'Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group,\\which has a reputation for making well-timed and occasionally\\controversial plays in the defense industry, has quietly placed\\its bets on another part of the market.', 'label': 'Business'}
==========
Example #2
----------
{'text': "Oil and Economy Cloud Stocks' Outlook (Reuters) Reuters - Soaring crude prices plus worries\\about the economy and the outlook for earnings are expected to\\hang over the stock market next week during the depth of the\\summer doldrums.", 'label': 'Business'}
We now define a tokenizer function that converts a text into a categorical-set value.
A whitespace tokenizer can be implemented as follows:
def tokenize_white_space(text):
    return text.split(" ")
tokenize_white_space("I eat an applepie!")
['I', 'eat', 'an', 'applepie!']
Alternatively, the tiktoken tokenizer can be used as follows:
gpt2_tokenizer = tiktoken.get_encoding("r50k_base")
def tokenize_gpt2(text):
    return gpt2_tokenizer.decode_tokens_bytes(gpt2_tokenizer.encode(text))
tokenize_gpt2("I eat an applepie!")
[b'I', b' eat', b' an', b' apple', b'pie', b'!']
We will use this last tokenizer.
tokenize = tokenize_gpt2
Let's load more data and apply the selected tokenizer. For this example to run quickly, we will only use the first 10k training examples.
YDF supports different dataset formats (see ydf.help.loading_data()). We will use Python dictionaries.
def create_dataset(split):
    labels = []
    tokens = []
    for example in islice(ag_news_dataset(split), 10_000):
        labels.append(example["label"])
        tokens.append(tokenize(example["text"]))
    return {"label": labels, "token": tokens}
train_dataset = create_dataset("train")
test_dataset = create_dataset("test")
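Before training, it can be useful to check how balanced the classes are in such a dictionary dataset. The sketch below is an optional sanity check and not part of the original tutorial: `label_distribution` is a hypothetical helper, and `toy_dataset` is a tiny stand-in with the same structure as `train_dataset` so that the snippet runs on its own.

```python
from collections import Counter


def label_distribution(dataset):
    """Return the fraction of examples per label, most frequent first."""
    counts = Counter(dataset["label"])
    total = sum(counts.values())
    return {label: count / total for label, count in counts.most_common()}


# Toy stand-in with the same layout as the dictionaries built by create_dataset.
toy_dataset = {
    "label": ["Business", "Business", "Sports", "World"],
    "token": [[b"Wall", b" St"], [b"Oil"], [b" Cup"], [b" Iraq"]],
}

distribution = label_distribution(toy_dataset)
print(distribution)  # {'Business': 0.5, 'Sports': 0.25, 'World': 0.25}
```

Calling `label_distribution(train_dataset)` would give the class proportions of the 10k training examples used here.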
Let's look at the first training example:
print("label:", train_dataset["label"][0])
print("tokens:", train_dataset["token"][0])
label: Business
tokens: [b'Wall', b' St', b'.', b' Bears', b' Claw', b' Back', b' Into', b' the', b' Black', b' (', b'Reuters', b')', b' Reuters', b' -', b' Short', b'-', b'sell', b'ers', b',', b' Wall', b' Street', b"'s", b' dwindling', b'\\', b'band', b' of', b' ultra', b'-', b'cy', b'n', b'ics', b',', b' are', b' seeing', b' green', b' again', b'.']
Train model¶
We can now train our model.
learner = ydf.RandomForestLearner(label="label", features=[("token", ydf.Semantic.CATEGORICAL_SET)])
model = learner.train(train_dataset)
Train model on 10000 examples
Model trained in 0:00:29.219372
It is always a good idea to check the model's description.
model.describe()
Task : CLASSIFICATION
Label : label
Features (1) : token
Weights : None
Trained with tuner : No
Model size : 86546 kB
Number of records: 10000
Number of columns: 2

Number of columns by type:
    CATEGORICAL_SET: 1 (50%)
    CATEGORICAL: 1 (50%)

Columns:

CATEGORICAL_SET: 1 (50%)
    0: "token" CATEGORICAL_SET has-dict vocab-size:2001 num-oods:130838 (1308.38%) most-frequent:"<OOD>" 130838 (1308.38%) dtype:DTYPE_BYTES

CATEGORICAL: 1 (50%)
    1: "label" CATEGORICAL has-dict vocab-size:5 zero-ood-items most-frequent:"Sci/Tech" 2662 (26.62%) dtype:DTYPE_BYTES

Terminology:
    nas: Number of non-available (i.e. missing) values.
    ood: Out of dictionary.
    manually-defined: Attribute whose type is manually defined by the user, i.e., the type was not automatically inferred.
    tokenized: The attribute value is obtained through tokenization.
    has-dict: The attribute is attached to a string dictionary e.g. a categorical attribute stored as a string.
    vocab-size: Number of unique values.
The following evaluation is computed on the validation or out-of-bag dataset.
Number of predictions (without weights): 10000
Number of predictions (with weights): 10000
Task: CLASSIFICATION
Label: label

Accuracy: 0.8668  CI95[W][0.861082 0.87236]
LogLoss: : 0.540504
ErrorRate: : 0.1332

Default Accuracy: : 0.2662
Default LogLoss: : 1.38522
Default ErrorRate: : 0.7338

Confusion Table:
truth\prediction
          Sci/Tech  World  Business  Sports
Sci/Tech      2356    106       149      51
World          152   2155        90     126
Business       368     94      1979      36
Sports          88     50        22    2178
Total: 10000
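The headline metrics of this report can be rechecked by hand from the confusion table. The sketch below is a plain-Python verification, not a YDF API call. The counts are copied from the table above, and the variable names are chosen for this illustration. It treats the "default" metrics as those of a model that ignores the input and predicts the label distribution, an interpretation that is consistent with the reported numbers.

```python
import math

labels = ["Sci/Tech", "World", "Business", "Sports"]
# Rows are the true labels, columns are the predicted labels.
confusion = [
    [2356, 106, 149, 51],
    [152, 2155, 90, 126],
    [368, 94, 1979, 36],
    [88, 50, 22, 2178],
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(labels)))
accuracy = correct / total
error_rate = 1 - accuracy

# Number of examples per true label (row sums).
class_counts = [sum(row) for row in confusion]
# Always predicting the most frequent label gives the default accuracy.
default_accuracy = max(class_counts) / total
# Entropy of the label distribution, in nats.
default_logloss = -sum(c / total * math.log(c / total) for c in class_counts)

print(f"accuracy={accuracy:.4f} error_rate={error_rate:.4f}")
print(f"default_accuracy={default_accuracy:.4f} default_logloss={default_logloss:.4f}")
```

The diagonal of the table holds the correctly classified examples, so the accuracy is 8668 / 10000 = 0.8668, and the largest row sum (2662 for Sci/Tech) gives the default accuracy of 0.2662.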
Variable importances measure the importance of an input feature for a model.
1. "token" 1.000000
1. "token" 300.000000
1. "token" 144392.000000
1. "token" 3971538.389777
Those variable importances are computed during training. More, and possibly more informative, variable importances are available when analyzing a model on a test dataset.
Only printing the first tree.
Tree #0: "token" is in {ENS, Greece, Gold, basketball, right, meters, Cup, Hamm, football, Yankees, ...[9 left]} [s:0.125663 n:10000 np:1281 miss:0] ; val:"Sci/Tech" prob:[0.2652, 0.2493, 0.2444, 0.2411] ├─(pos)─ "token" is in [BITMAP] {ENS, win, U, next, P, icker, Minister, while, month, Up, ...[53 left]} [s:0.0681851 n:1281 np:950 miss:0] ; val:"Sports" prob:[0.0483997, 0.0960187, 0.0179547, 0.837627] | ├─(pos)─ "token" is in {..., city, business, pay, war, Israeli, killing, campaign, CP} [s:0.254497 n:950 np:117 miss:0] ; val:"Sports" prob:[0.00736842, 0.113684, 0.00210526, 0.876842] | | ├─(pos)─ "token" is in {/, giant, But, If, Air, gets, advertising, We, increased, ash} [s:0.423756 n:117 np:23 miss:0] ; val:"World" prob:[0.0598291, 0.803419, 0.017094, 0.119658] | | | ├─(pos)─ "token" is in { an, gt, No, pool, known} [s:0.614503 n:23 np:16 miss:0] ; val:"Sports" prob:[0.26087, 0.0434783, 0.0869565, 0.608696] | | | | ├─(pos)─ "token" is in { for, her, where} [s:0.37677 n:16 np:14 miss:0] ; val:"Sports" prob:[0, 0, 0.125, 0.875] | | | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | | └─(neg)─ "token" is in { Americans} [s:0.410116 n:7 np:1 miss:0] ; val:"Sci/Tech" prob:[0.857143, 0.142857, 0, 0] | | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ "token" is in { if} [s:0.0441666 n:94 np:2 miss:0] ; val:"World" prob:[0.0106383, 0.989362, 0, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0.5, 0, 0] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in {AFP, China, within, 22, Some} [s:0.0303842 n:833 np:34 miss:0] ; val:"Sports" prob:[0, 0.0168067, 0, 0.983193] | | ├─(pos)─ "token" is in { U, gold, Friday, Sunday, team, her, tournament, AT} [s:0.605797 n:34 np:24 miss:0] ; val:"Sports" prob:[0, 0.294118, 0, 0.705882] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { 
Kerry} [s:0.0140242 n:799 np:2 miss:0] ; val:"Sports" prob:[0, 0.00500626, 0, 0.994994] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { All, worth} [s:0.00912285 n:797 np:22 miss:0] ; val:"Sports" prob:[0, 0.00250941, 0, 0.997491] | | ├─(pos)─ "token" is in { Monday, put} [s:0.304636 n:22 np:2 miss:0] ; val:"Sports" prob:[0, 0.0909091, 0, 0.909091] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | └─(neg)─ "token" is in {$, ", public, sales, 20, phone, called, also, ud, workers, ...[10 left]} [s:0.269812 n:331 np:71 miss:0] ; val:"Sports" prob:[0.166163, 0.0453172, 0.0634441, 0.725076] | ├─(pos)─ "token" is in { that, 't, music, rights} [s:0.30377 n:71 np:38 miss:0] ; val:"Sci/Tech" prob:[0.464789, 0.169014, 0.267606, 0.0985915] | | ├─(pos)─ "token" is in { are, Thursday, had, can, do, .', Computer, media} [s:0.418525 n:38 np:31 miss:0] ; val:"Sci/Tech" prob:[0.789474, 0.105263, 0.105263, 0] | | | ├─(pos)─ "token" is in { Inc} [s:0.0809077 n:31 np:3 miss:0] ; val:"Sci/Tech" prob:[0.967742, 0.0322581, 0, 0] | | | | ├─(pos)─ val:"Sci/Tech" prob:[0.666667, 0.333333, 0, 0] | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ "token" is in { he} [s:0.682908 n:7 np:3 miss:0] ; val:"Business" prob:[0, 0.428571, 0.571429, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { their, two, when, China, to} [s:0.36054 n:33 np:12 miss:0] ; val:"Business" prob:[0.0909091, 0.242424, 0.454545, 0.212121] | | ├─(pos)─ "token" is in {er, ly} [s:0.636514 n:12 np:4 miss:0] ; val:"Sports" prob:[0.25, 0.333333, 0, 0.416667] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ "token" is in { his, against, Aug, race} [s:0.661563 n:8 np:5 miss:0] ; val:"Sports" prob:[0.375, 0, 0, 0.625] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ 
"token" is in {A} [s:0.314492 n:21 np:2 miss:0] ; val:"Business" prob:[0, 0.190476, 0.714286, 0.0952381] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ "token" is in { say, an, York} [s:0.514653 n:19 np:4 miss:0] ; val:"Business" prob:[0, 0.210526, 0.789474, 0] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { million, Windows, attack, video, Linux, Intel, rate} [s:0.266909 n:260 np:22 miss:0] ; val:"Sports" prob:[0.0846154, 0.0115385, 0.00769231, 0.896154] | ├─(pos)─ "token" is in { at} [s:0.184907 n:22 np:1 miss:0] ; val:"Sci/Tech" prob:[0.954545, 0.0454545, 0, 0] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { international} [s:0.0485287 n:238 np:2 miss:0] ; val:"Sports" prob:[0.00420168, 0.00840336, 0.00840336, 0.978992] | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ "token" is in {A, Power} [s:0.0346642 n:236 np:5 miss:0] ; val:"Sports" prob:[0.00423729, 0, 0.00847458, 0.987288] | ├─(pos)─ "token" is in {), by} [s:0.673012 n:5 np:3 miss:0] ; val:"Sports" prob:[0, 0, 0.4, 0.6] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { The} [s:0.00721866 n:231 np:44 miss:0] ; val:"Sports" prob:[0.004329, 0, 0, 0.995671] | ├─(pos)─ "token" is in { and} [s:0.0302433 n:44 np:12 miss:0] ; val:"Sports" prob:[0.0227273, 0, 0, 0.977273] | | ├─(pos)─ "token" is in {-, S, out, Hamm} [s:0.171311 n:12 np:10 miss:0] ; val:"Sports" prob:[0.0833333, 0, 0, 0.916667] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"Sci/Tech" prob:[0.5, 0, 0, 0.5] | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] └─(neg)─ "token" is in {$, REF, search, based, stocks, Co, rose, services, Research, per, ...[35 left]} [s:0.114344 n:8719 np:2543 miss:0] ; val:"Sci/Tech" prob:[0.297052, 0.27182, 0.277669, 0.153458] ├─(pos)─ "token" is in { oil, 
Iraq, Oil, quick, os, fur, fall, Shi, ics, Canada, ...[19 left]} [s:0.157527 n:2543 np:734 miss:0] ; val:"Business" prob:[0.373575, 0.0696028, 0.529689, 0.0271333] | ├─(pos)─ "token" is in { world, 2, com, which, lt, Corp, 1, what, stock, .,, ...[41 left]} [s:0.129166 n:734 np:491 miss:0] ; val:"Business" prob:[0.0177112, 0.106267, 0.874659, 0.0013624] | | ├─(pos)─ "token" is in {ing, In, cost, life, told, net, study, ver} [s:0.0525037 n:491 np:74 miss:0] ; val:"Business" prob:[0.0264766, 0.00203666, 0.971487, 0] | | | ├─(pos)─ "token" is in {3, stock, life, these, panel} [s:0.160153 n:74 np:6 miss:0] ; val:"Business" prob:[0.175676, 0, 0.824324, 0] | | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | └─(neg)─ "token" is in { World, online, SP, Fl} [s:0.161811 n:68 np:6 miss:0] ; val:"Business" prob:[0.102941, 0, 0.897059, 0] | | | | ├─(pos)─ "token" is in { file} [s:0.219512 n:6 np:2 miss:0] ; val:"Sci/Tech" prob:[0.833333, 0, 0.166667, 0] | | | | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | └─(neg)─ "token" is in { as} [s:0.00907112 n:62 np:11 miss:0] ; val:"Business" prob:[0.0322581, 0, 0.967742, 0] | | | | ├─(pos)─ "token" is in { broadband} [s:0.304636 n:11 np:1 miss:0] ; val:"Business" prob:[0.0909091, 0, 0.909091, 0] | | | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | | └─(neg)─ "token" is in { study} [s:0.0693267 n:51 np:2 miss:0] ; val:"Business" prob:[0.0196078, 0, 0.980392, 0] | | | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ "token" is in {...} [s:0.0114689 n:417 np:4 miss:0] ; val:"Business" prob:[0, 0.00239808, 0.997602, 0] | | | ├─(pos)─ val:"Business" prob:[0, 0.25, 0.75, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { US, company, /, can, h, like, 36, Y, IC, quarter, ...[10 left]} [s:0.214604 n:243 np:105 miss:0] ; 
val:"Business" prob:[0, 0.316872, 0.679012, 0.00411523] | | ├─(pos)─ "token" is in {Iraq} [s:0.0538017 n:105 np:1 miss:0] ; val:"Business" prob:[0, 0.00952381, 0.990476, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in {M, no, ', Australia, biggest, six, industry, eight, race, ic, ...[10 left]} [s:0.296415 n:138 np:50 miss:0] ; val:"World" prob:[0, 0.550725, 0.442029, 0.00724638] | | ├─(pos)─ "token" is in { O} [s:0.0980391 n:50 np:1 miss:0] ; val:"World" prob:[0, 0.98, 0, 0.02] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in {'s, Aug, if, being, V, rose} [s:0.187208 n:88 np:48 miss:0] ; val:"Business" prob:[0, 0.306818, 0.693182, 0] | | ├─(pos)─ "token" is in { AP, ..., ers, 23} [s:0.327224 n:48 np:18 miss:0] ; val:"World" prob:[0, 0.541667, 0.458333, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ "token" is in { will} [s:0.291194 n:30 np:13 miss:0] ; val:"Business" prob:[0, 0.266667, 0.733333, 0] | | | ├─(pos)─ "token" is in { Google} [s:0.281354 n:13 np:3 miss:0] ; val:"World" prob:[0, 0.615385, 0.384615, 0] | | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | | └─(neg)─ "token" is in { for, ets} [s:0.223144 n:10 np:6 miss:0] ; val:"World" prob:[0, 0.8, 0.2, 0] | | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | | └─(neg)─ val:"World" prob:[0, 0.5, 0.5, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { all} [s:0.116907 n:40 np:1 miss:0] ; val:"Business" prob:[0, 0.025, 0.975, 0] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { Up, buy, higher, Japan, reports, nearly, work, close, head, quarterly, ...[34 left]} [s:0.103046 n:1809 np:608 miss:0] ; val:"Sci/Tech" prob:[0.517966, 0.0547264, 0.389718, 0.0375898] | ├─(pos)─ "token" is in { -, new, G, shares, lead, Says, Research, America, rise, full, ...[27 
left]} [s:0.104344 n:608 np:412 miss:0] ; val:"Business" prob:[0.251645, 0.0394737, 0.697368, 0.0115132] | | ├─(pos)─ "token" is in {'s, com, through, while, online, six, ly, North, customers, mobile, ...[11 left]} [s:0.190259 n:412 np:131 miss:0] ; val:"Business" prob:[0.126214, 0.0533981, 0.820388, 0] | | | ├─(pos)─ "token" is in {?, Friday, i, third, group, another, there, I, old, work, ...[8 left]} [s:0.267799 n:131 np:54 miss:0] ; val:"Business" prob:[0.366412, 0.160305, 0.473282, 0] | | | | ├─(pos)─ "token" is in { as, U, Inc, maker, quarter, she, housing} [s:0.504245 n:54 np:39 miss:0] ; val:"Business" prob:[0, 0.259259, 0.740741, 0] | | | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | | | └─(neg)─ "token" is in { current} [s:0.24493 n:15 np:1 miss:0] ; val:"World" prob:[0, 0.933333, 0.0666667, 0] | | | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | | | └─(neg)─ "token" is in { than, months, airline, minutes, each, Financial} [s:0.343602 n:77 np:16 miss:0] ; val:"Sci/Tech" prob:[0.623377, 0.0909091, 0.285714, 0] | | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | | └─(neg)─ "token" is in { NEW, J, filed, 500, Kong} [s:0.313108 n:61 np:13 miss:0] ; val:"Sci/Tech" prob:[0.786885, 0.114754, 0.0983607, 0] | | | | ├─(pos)─ "token" is in { for, Thursday, half} [s:0.666278 n:13 np:8 miss:0] ; val:"Business" prob:[0.153846, 0.384615, 0.461538, 0] | | | | | ├─(pos)─ "token" is in { 2004, 7} [s:0.323642 n:8 np:5 miss:0] ; val:"Business" prob:[0.25, 0, 0.75, 0] | | | | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | | | | └─(neg)─ val:"Sci/Tech" prob:[0.666667, 0, 0.333333, 0] | | | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | | | └─(neg)─ "token" is in {L} [s:0.133423 n:48 np:3 miss:0] ; val:"Sci/Tech" prob:[0.958333, 0.0416667, 0, 0] | | | | ├─(pos)─ val:"World" prob:[0.333333, 0.666667, 0, 0] | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ "token" is in { long, el, chip} [s:0.0341181 
n:281 np:28 miss:0] ; val:"Business" prob:[0.0142349, 0.00355872, 0.982206, 0] | | | ├─(pos)─ "token" is in { will} [s:0.239389 n:28 np:7 miss:0] ; val:"Business" prob:[0.142857, 0, 0.857143, 0] | | | | ├─(pos)─ "token" is in { on, )} [s:0.682908 n:7 np:4 miss:0] ; val:"Sci/Tech" prob:[0.571429, 0, 0.428571, 0] | | | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ "token" is in { bank} [s:0.0112943 n:253 np:15 miss:0] ; val:"Business" prob:[0, 0.00395257, 0.996047, 0] | | | ├─(pos)─ "token" is in { Aug} [s:0.24493 n:15 np:1 miss:0] ; val:"Business" prob:[0, 0.0666667, 0.933333, 0] | | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { were, ", J, take, past, et, start, cost, quarterly, president, ...[15 left]} [s:0.315295 n:196 np:77 miss:0] ; val:"Sci/Tech" prob:[0.515306, 0.0102041, 0.438776, 0.0357143] | | ├─(pos)─ "token" is in { national, Shrine, broadband} [s:0.164628 n:77 np:3 miss:0] ; val:"Sci/Tech" prob:[0.961039, 0.012987, 0.025974, 0] | | | ├─(pos)─ val:"Business" prob:[0, 0.333333, 0.666667, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in {a, help, much, music, ant, chip, web, try, source} [s:0.282532 n:119 np:21 miss:0] ; val:"Business" prob:[0.226891, 0.00840336, 0.705882, 0.0588235] | | ├─(pos)─ "token" is in { engine, fifth} [s:0.314492 n:21 np:2 miss:0] ; val:"Sci/Tech" prob:[0.904762, 0.047619, 0.047619, 0] | | | ├─(pos)─ val:"World" prob:[0, 0.5, 0.5, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { is, icker, With, discovered, soccer} [s:0.195934 n:98 np:22 miss:0] ; val:"Business" prob:[0.0816327, 0, 0.846939, 0.0714286] | | ├─(pos)─ "token" is in { his} [s:0.114278 n:22 np:2 miss:0] ; val:"Business" prob:[0.272727, 0, 0.409091, 0.318182] | | | ├─(pos)─ 
val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ "token" is in { nearly, Talks} [s:0.254456 n:20 np:3 miss:0] ; val:"Business" prob:[0.3, 0, 0.45, 0.25] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ "token" is in { double} [s:0.362211 n:17 np:2 miss:0] ; val:"Business" prob:[0.352941, 0, 0.529412, 0.117647] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ "token" is in {), T, what, biggest} [s:0.673012 n:15 np:6 miss:0] ; val:"Business" prob:[0.4, 0, 0.6, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { When} [s:0.0518136 n:76 np:1 miss:0] ; val:"Business" prob:[0.0263158, 0, 0.973684, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { contract} [s:0.04535 n:75 np:3 miss:0] ; val:"Business" prob:[0.0133333, 0, 0.986667, 0] | | ├─(pos)─ val:"Business" prob:[0.333333, 0, 0.666667, 0] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in {gt, just, computer, scientists, low, Net, researchers, hopes, 151, SP, ...[25 left]} [s:0.130152 n:1201 np:337 miss:0] ; val:"Sci/Tech" prob:[0.652789, 0.062448, 0.233972, 0.050791] | ├─(pos)─ "token" is in { win, year, 16} [s:0.0615209 n:337 np:8 miss:0] ; val:"Sci/Tech" prob:[0.982196, 0.00296736, 0, 0.0148368] | | ├─(pos)─ "token" is in {,} [s:0.323642 n:8 np:6 miss:0] ; val:"Sports" prob:[0.375, 0, 0, 0.625] | | | ├─(pos)─ "token" is in { will} [s:0.450561 n:6 np:1 miss:0] ; val:"Sports" prob:[0.166667, 0, 0, 0.833333] | | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { Sunday} [s:0.0164384 n:329 np:2 miss:0] ; val:"Sci/Tech" prob:[0.99696, 0.00303951, 0, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0.5, 0, 0] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { -, i, ', still, Ex, Down, ,", Palestinian} [s:0.0592067 n:864 np:258 miss:0] ; 
val:"Sci/Tech" prob:[0.524306, 0.0856481, 0.325231, 0.0648148] | ├─(pos)─ "token" is in {Reuters, are, companies, 6, key, consumer, TA, Bloomberg, June, ink, ...[1 left]} [s:0.171246 n:258 np:110 miss:0] ; val:"Sci/Tech" prob:[0.472868, 0.236434, 0.25969, 0.0310078] | | ├─(pos)─ "token" is in { after, down, found, League, 8, Exchange, De, IP, p, ES, ...[4 left]} [s:0.294623 n:110 np:38 miss:0] ; val:"Business" prob:[0.390909, 0.0636364, 0.527273, 0.0181818] | | | ├─(pos)─ "token" is in { United} [s:0.085211 n:38 np:1 miss:0] ; val:"Business" prob:[0, 0, 0.947368, 0.0526316] | | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | | └─(neg)─ "token" is in { North} [s:0.124251 n:37 np:1 miss:0] ; val:"Business" prob:[0, 0, 0.972973, 0.027027] | | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ "token" is in { year, other, M, under, retail, Price, chain, Wi, popular, ile, ...[1 left]} [s:0.674122 n:72 np:43 miss:0] ; val:"Sci/Tech" prob:[0.597222, 0.0972222, 0.305556, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ "token" is in { Minister, shrine, Palestinian} [s:0.23635 n:29 np:4 miss:0] ; val:"Business" prob:[0, 0.241379, 0.758621, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ "token" is in {11} [s:0.202388 n:25 np:2 miss:0] ; val:"Business" prob:[0, 0.12, 0.88, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ "token" is in { no} [s:0.178845 n:23 np:1 miss:0] ; val:"Business" prob:[0, 0.0434783, 0.956522, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { his, would, r, way, days, six, seven, z, 15, Department, ...[8 left]} [s:0.374713 n:148 np:44 miss:0] ; val:"Sci/Tech" prob:[0.533784, 0.364865, 0.0608108, 0.0405405] | | ├─(pos)─ "token" is in { five, took, without} [s:0.172432 n:44 np:3 miss:0] ; val:"World" prob:[0, 0.886364, 0, 0.113636] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] 
| | | └─(neg)─ "token" is in { It, off} [s:0.194909 n:41 np:2 miss:0] ; val:"World" prob:[0, 0.95122, 0, 0.0487805] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { who, early, vote, Earth, nuclear, manager} [s:0.21948 n:104 np:10 miss:0] ; val:"Sci/Tech" prob:[0.759615, 0.144231, 0.0865385, 0.00961538] | | ├─(pos)─ "token" is in {39} [s:0.325083 n:10 np:1 miss:0] ; val:"World" prob:[0, 0.9, 0, 0.1] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { NEW, time, police, Some} [s:0.15503 n:94 np:7 miss:0] ; val:"Sci/Tech" prob:[0.840426, 0.0638298, 0.0957447, 0] | | ├─(pos)─ "token" is in { first, offering, drop} [s:0.682908 n:7 np:4 miss:0] ; val:"Business" prob:[0, 0.428571, 0.571429, 0] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { no, what, recently} [s:0.203438 n:87 np:9 miss:0] ; val:"Sci/Tech" prob:[0.908046, 0.0344828, 0.0574713, 0] | | ├─(pos)─ "token" is in {;, was, new} [s:0.686962 n:9 np:5 miss:0] ; val:"Business" prob:[0.222222, 0.222222, 0.555556, 0] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[0.5, 0.5, 0, 0] | | └─(neg)─ "token" is in {largest} [s:0.0685932 n:78 np:1 miss:0] ; val:"Sci/Tech" prob:[0.987179, 0.0128205, 0, 0] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { H, US, Games, court, British, past, official, attack, shrine, at, ...[9 left]} [s:0.177346 n:606 np:85 miss:0] ; val:"Sci/Tech" prob:[0.546205, 0.0214521, 0.353135, 0.0792079] | ├─(pos)─ "token" is in { more, were, could, may, federal, capital, 200, Department, signed, staff, ...[4 left]} [s:0.506976 n:85 np:29 miss:0] ; val:"Sports" prob:[0.152941, 0.129412, 0.211765, 0.505882] | | ├─(pos)─ "token" is in { ..., high, city, services, ster} [s:0.566219 n:29 np:13 
miss:0] ; val:"Business" prob:[0.448276, 0.0344828, 0.517241, 0] | | | ├─(pos)─ "token" is in { the, ), 39} [s:0.271189 n:13 np:12 miss:0] ; val:"Sci/Tech" prob:[0.923077, 0.0769231, 0, 0] | | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ "token" is in { copyright} [s:0.147148 n:16 np:2 miss:0] ; val:"Business" prob:[0.0625, 0, 0.9375, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { open, accused, French, storm} [s:0.222491 n:56 np:7 miss:0] ; val:"Sports" prob:[0, 0.178571, 0.0535714, 0.767857] | | ├─(pos)─ "token" is in { of} [s:0.410116 n:7 np:6 miss:0] ; val:"World" prob:[0, 0.857143, 0.142857, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in {AP, most, More} [s:0.283639 n:49 np:5 miss:0] ; val:"Sports" prob:[0, 0.0816327, 0.0408163, 0.877551] | | ├─(pos)─ "token" is in { company} [s:0.673012 n:5 np:2 miss:0] ; val:"World" prob:[0, 0.6, 0.4, 0] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { new} [s:0.108471 n:44 np:1 miss:0] ; val:"Sports" prob:[0, 0.0227273, 0, 0.977273] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | └─(neg)─ "token" is in { Microsoft, set, Quote, took, eight, Microsoft, up, Paul, San, UK, ...[24 left]} [s:0.122294 n:521 np:117 miss:0] ; val:"Sci/Tech" prob:[0.610365, 0.00383877, 0.3762, 0.00959693] | ├─(pos)─ "token" is in {ar} [s:0.0865031 n:117 np:2 miss:0] ; val:"Sci/Tech" prob:[0.974359, 0.017094, 0.00854701, 0] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { earnings} [s:0.049918 n:115 np:1 miss:0] ; val:"Sci/Tech" prob:[0.991304, 0, 0.00869565, 0] | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { #, United, Aug, 
when, or, other, T, shares, plan, o, ...[20 left]} [s:0.180217 n:404 np:218 miss:0] ; val:"Sci/Tech" prob:[0.50495, 0, 0.482673, 0.0123762] | ├─(pos)─ "token" is in {ing, P, K, high, or, 000, South, maker, latest, any, ...[5 left]} [s:0.130709 n:218 np:50 miss:0] ; val:"Business" prob:[0.252294, 0, 0.747706, 0] | | ├─(pos)─ "token" is in { years, giant, ant, history} [s:0.185663 n:50 np:7 miss:0] ; val:"Sci/Tech" prob:[0.68, 0, 0.32, 0] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ "token" is in {,, C, down, offers} [s:0.3334 n:43 np:36 miss:0] ; val:"Sci/Tech" prob:[0.790698, 0, 0.209302, 0] | | | ├─(pos)─ "token" is in { Inc, suspended} [s:0.0821326 n:36 np:9 miss:0] ; val:"Sci/Tech" prob:[0.944444, 0, 0.0555556, 0] | | | | ├─(pos)─ "token" is in { unit} [s:0.155811 n:9 np:5 miss:0] ; val:"Sci/Tech" prob:[0.777778, 0, 0.222222, 0] | | | | | ├─(pos)─ "token" is in {s} [s:0.223144 n:5 np:4 miss:0] ; val:"Sci/Tech" prob:[0.6, 0, 0.4, 0] | | | | | | ├─(pos)─ val:"Sci/Tech" prob:[0.75, 0, 0.25, 0] | | | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { #, from, four, star} [s:0.0873229 n:168 np:89 miss:0] ; val:"Business" prob:[0.125, 0, 0.875, 0] | | ├─(pos)─ "token" is in { business, long, being, under, al, released, space} [s:0.299613 n:89 np:17 miss:0] ; val:"Business" prob:[0.235955, 0, 0.764045, 0] | | | ├─(pos)─ "token" is in { signs} [s:0.223718 n:17 np:1 miss:0] ; val:"Sci/Tech" prob:[0.941176, 0, 0.0588235, 0] | | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ "token" is in { chip, Mac} [s:0.103641 n:72 np:8 miss:0] ; val:"Business" prob:[0.0694444, 0, 0.930556, 0] | | | ├─(pos)─ "token" is in {,} [s:0.215762 n:8 np:6 miss:0] ; val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | | | ├─(pos)─ "token" is in { 
development} [s:0.219512 n:6 np:1 miss:0] ; val:"Business" prob:[0.333333, 0, 0.666667, 0] | | | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | | └─(neg)─ val:"Business" prob:[0.2, 0, 0.8, 0] | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ "token" is in {4} [s:0.0804848 n:64 np:1 miss:0] ; val:"Business" prob:[0.015625, 0, 0.984375, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { first, prices, er, National, loss, e, fourth, Update, making, engine, ...[4 left]} [s:0.21351 n:186 np:36 miss:0] ; val:"Sci/Tech" prob:[0.801075, 0, 0.172043, 0.0268817] | ├─(pos)─ "token" is in { night, contract} [s:0.224354 n:36 np:3 miss:0] ; val:"Business" prob:[0.222222, 0, 0.666667, 0.111111] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ "token" is in { this, D, growth, Posts} [s:0.337816 n:33 np:6 miss:0] ; val:"Business" prob:[0.242424, 0, 0.727273, 0.030303] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { two, federal, tech} [s:0.2242 n:27 np:5 miss:0] ; val:"Business" prob:[0.0740741, 0, 0.888889, 0.037037] | | ├─(pos)─ "token" is in { stocks} [s:0.673012 n:5 np:2 miss:0] ; val:"Sci/Tech" prob:[0.4, 0, 0.4, 0.2] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[0.666667, 0, 0, 0.333333] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { 2004} [s:0.0221463 n:150 np:9 miss:0] ; val:"Sci/Tech" prob:[0.94, 0, 0.0533333, 0.00666667] | ├─(pos)─ "token" is in {-} [s:0.348832 n:9 np:1 miss:0] ; val:"Sci/Tech" prob:[0.888889, 0, 0, 0.111111] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in {-, for, this, through, growth, on, users, chip, using} [s:0.0614785 n:141 np:122 miss:0] ; val:"Sci/Tech" prob:[0.943262, 0, 0.0567376, 0] | ├─(pos)─ "token" is in { New, Q} [s:0.0609236 n:122 np:4 
miss:0] ; val:"Sci/Tech" prob:[0.983607, 0, 0.0163934, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { back, including, Hurricane} [s:0.313649 n:19 np:4 miss:0] ; val:"Sci/Tech" prob:[0.684211, 0, 0.315789, 0] | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in {m} [s:0.15251 n:15 np:1 miss:0] ; val:"Sci/Tech" prob:[0.866667, 0, 0.133333, 0] | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { having} [s:0.257319 n:14 np:1 miss:0] ; val:"Sci/Tech" prob:[0.928571, 0, 0.0714286, 0] | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] └─(neg)─ "token" is in { country, Minister, Bush, ite, attack, political, Shiite, militia, election, 14, ...[18 left]} [s:0.118711 n:6176 np:1240 miss:0] ; val:"World" prob:[0.265544, 0.355084, 0.173899, 0.205473] ├─(pos)─ "token" is in { Olympic, maker, quarter, music, use, costs, 5, os, wins, showed, ...[20 left]} [s:0.10229 n:1240 np:119 miss:0] ; val:"World" prob:[0.0653226, 0.820161, 0.0717742, 0.0427419] | ├─(pos)─ "token" is in { yesterday, medal, Games, R, No, round, 5} [s:0.387025 n:119 np:24 miss:0] ; val:"Sci/Tech" prob:[0.310924, 0.201681, 0.285714, 0.201681] | | ├─(pos)─ "token" is in { Israeli} [s:0.286836 n:24 np:2 miss:0] ; val:"Sports" prob:[0, 0.0833333, 0, 0.916667] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ "token" is in { G, plans, International, attack, European, customers, Real, space, the, storm} [s:0.458909 n:95 np:31 miss:0] ; val:"Sci/Tech" prob:[0.389474, 0.231579, 0.357895, 0.0210526] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { world, ing, Web, Not, Top} [s:0.264038 n:64 np:8 miss:0] ; val:"Business" prob:[0.09375, 0.34375, 0.53125, 0.03125] | | ├─(pos)─ "token" is in { time, between, Wal} [s:0.661563 n:8 np:3 miss:0] ; val:"Sci/Tech" prob:[0.625, 0, 0.125, 0.25] | | | 
├─(pos)─ val:"Sports" prob:[0, 0, 0.333333, 0.666667] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { States, G, officials, other, York, John, use, led} [s:0.323605 n:56 np:15 miss:0] ; val:"Business" prob:[0.0178571, 0.392857, 0.589286, 0] | | ├─(pos)─ "token" is in { urged} [s:0.24493 n:15 np:1 miss:0] ; val:"World" prob:[0.0666667, 0.933333, 0, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { S, information} [s:0.132679 n:41 np:3 miss:0] ; val:"Business" prob:[0, 0.195122, 0.804878, 0] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { out, been, af} [s:0.389377 n:38 np:5 miss:0] ; val:"Business" prob:[0, 0.131579, 0.868421, 0] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { share, Shares, far, CH, Network, free, running, ou, continue, corporate, ...[6 left]} [s:0.0682578 n:1121 np:70 miss:0] ; val:"World" prob:[0.0392507, 0.885816, 0.0490633, 0.0258698] | ├─(pos)─ "token" is in { over, I, is, under, used, economy, NASA, agreement, st} [s:0.621662 n:70 np:29 miss:0] ; val:"World" prob:[0.285714, 0.328571, 0.142857, 0.242857] | | ├─(pos)─ "token" is in { #, up, presidential} [s:0.311732 n:29 np:9 miss:0] ; val:"Sci/Tech" prob:[0.655172, 0, 0.344828, 0] | | | ├─(pos)─ "token" is in { are} [s:0.348832 n:9 np:1 miss:0] ; val:"Business" prob:[0.111111, 0, 0.888889, 0] | | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ "token" is in { sell} [s:0.129201 n:20 np:1 miss:0] ; val:"Sci/Tech" prob:[0.9, 0, 0.1, 0] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ "token" is in { before} [s:0.206192 n:19 np:1 miss:0] ; val:"Sci/Tech" prob:[0.947368, 0, 0.0526316, 0] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in {'s, Tuesday, last, r, 
en, open, ph} [s:0.59014 n:41 np:22 miss:0] ; val:"World" prob:[0.0243902, 0.560976, 0, 0.414634] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { more, 11} [s:0.336496 n:19 np:2 miss:0] ; val:"Sports" prob:[0.0526316, 0.0526316, 0, 0.894737] | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0.5, 0, 0] | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | └─(neg)─ "token" is in { night, Street, ton, FR, available, highs, ter, Del, cell, onductor, ...[3 left]} [s:0.0667929 n:1051 np:32 miss:0] ; val:"World" prob:[0.0228354, 0.922931, 0.0428164, 0.0114177] | ├─(pos)─ "token" is in { #, President} [s:0.512934 n:32 np:18 miss:0] ; val:"Business" prob:[0.25, 0.09375, 0.5, 0.15625] | | ├─(pos)─ "token" is in { at, Del} [s:0.348832 n:18 np:2 miss:0] ; val:"Business" prob:[0.0555556, 0, 0.888889, 0.0555556] | | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0, 0, 0.5] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in {AP} [s:0.51958 n:14 np:3 miss:0] ; val:"Sci/Tech" prob:[0.5, 0.214286, 0, 0.285714] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { victory, strong, believe} [s:0.655482 n:11 np:4 miss:0] ; val:"Sci/Tech" prob:[0.636364, 0, 0, 0.363636] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in {lt, victory, now, biggest, He, ina, This, and, popular, ches, ...[2 left]} [s:0.0418916 n:1019 np:69 miss:0] ; val:"World" prob:[0.0157017, 0.94897, 0.0284593, 0.00686948] | ├─(pos)─ "token" is in { The, G, face, Bush, THE} [s:0.325999 n:69 np:18 miss:0] ; val:"World" prob:[0.115942, 0.637681, 0.246377, 0] | | ├─(pos)─ "token" is in { government, now} [s:0.325598 n:18 np:4 miss:0] ; val:"Business" prob:[0, 0.166667, 0.833333, 0] | | | ├─(pos)─ val:"World" prob:[0, 0.75, 0.25, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { Ham} [s:0.105204 n:51 np:6 miss:0] ; val:"World" prob:[0.156863, 0.803922, 0.0392157, 0] | | ├─(pos)─ "token" is in { 
the} [s:0.636514 n:6 np:4 miss:0] ; val:"World" prob:[0, 0.666667, 0.333333, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in {A, over} [s:0.231221 n:45 np:5 miss:0] ; val:"World" prob:[0.177778, 0.822222, 0, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { when, like} [s:0.103714 n:40 np:3 miss:0] ; val:"World" prob:[0.075, 0.925, 0, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[0.666667, 0.333333, 0, 0] | | └─(neg)─ "token" is in { its} [s:0.0634578 n:37 np:4 miss:0] ; val:"World" prob:[0.027027, 0.972973, 0, 0] | | ├─(pos)─ val:"World" prob:[0.25, 0.75, 0, 0] | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ "token" is in {Microsoft, research, silver} [s:0.0280381 n:950 np:6 miss:0] ; val:"World" prob:[0.00842105, 0.971579, 0.0126316, 0.00736842] | ├─(pos)─ "token" is in { silver} [s:0.636514 n:6 np:2 miss:0] ; val:"Sci/Tech" prob:[0.666667, 0, 0, 0.333333] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { year, Olympics, won, 4, International, data, nation, again, away, seconds, ...[1 left]} [s:0.0382867 n:944 np:68 miss:0] ; val:"World" prob:[0.00423729, 0.977754, 0.0127119, 0.00529661] | ├─(pos)─ "token" is in { not, F, make, D, Update} [s:0.251138 n:68 np:8 miss:0] ; val:"World" prob:[0.0588235, 0.764706, 0.102941, 0.0735294] | | ├─(pos)─ "token" is in { will, E} [s:0.661563 n:8 np:3 miss:0] ; val:"Business" prob:[0, 0, 0.625, 0.375] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { out, her, IT} [s:0.149629 n:60 np:7 miss:0] ; val:"World" prob:[0.0666667, 0.866667, 0.0333333, 0.0333333] | | ├─(pos)─ "token" is in { has, over} [s:0.682908 n:7 np:3 miss:0] ; val:"Sci/Tech" prob:[0.285714, 0.285714, 0.285714, 0.142857] | | | ├─(pos)─ val:"Sci/Tech" prob:[0.666667, 0, 0, 0.333333] | | | └─(neg)─ val:"World" prob:[0, 0.5, 0.5, 
0] | | └─(neg)─ "token" is in { Olympics, 3, storm} [s:0.11766 n:53 np:8 miss:0] ; val:"World" prob:[0.0377358, 0.943396, 0, 0.0188679] | | ├─(pos)─ "token" is in { Olympics} [s:0.240931 n:8 np:3 miss:0] ; val:"World" prob:[0.25, 0.625, 0, 0.125] | | | ├─(pos)─ val:"World" prob:[0, 0.666667, 0, 0.333333] | | | └─(neg)─ "token" is in {)} [s:0.673012 n:5 np:2 miss:0] ; val:"World" prob:[0.4, 0.6, 0, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ "token" is in {ians} [s:0.00849586 n:876 np:5 miss:0] ; val:"World" prob:[0, 0.994292, 0.00570776, 0] | ├─(pos)─ "token" is in {ers} [s:0.673012 n:5 np:2 miss:0] ; val:"World" prob:[0, 0.6, 0.4, 0] | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ "token" is in { team, results, president} [s:0.0120252 n:871 np:28 miss:0] ; val:"World" prob:[0, 0.996556, 0.00344432, 0] | ├─(pos)─ "token" is in { United, a, ary} [s:0.3405 n:28 np:3 miss:0] ; val:"World" prob:[0, 0.892857, 0.107143, 0] | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ val:"World" prob:[0, 1, 0, 0] └─(neg)─ "token" is in { ", military, war, Canadian, minister, violence, China, authorities, launch, Democratic, ...[8 left]} [s:0.0654585 n:4936 np:612 miss:0] ; val:"Sci/Tech" prob:[0.315843, 0.23825, 0.199554, 0.246353] ├─(pos)─ "token" is in { was, i, officials, top, city, John, National, reports, saying, Federal, ...[22 left]} [s:0.16664 n:612 np:290 miss:0] ; val:"World" prob:[0.240196, 0.655229, 0.0653595, 0.0392157] | ├─(pos)─ "token" is in { As, He, Wins, England, First, t, fight, OL, Kh} [s:0.144584 n:290 np:18 miss:0] ; val:"World" prob:[0.0310345, 0.917241, 0.0172414, 0.0344828] | | ├─(pos)─ "token" is in {ed, U, C, figures} [s:0.590842 n:18 np:5 miss:0] ; val:"Sports" prob:[0.166667, 0.166667, 0.111111, 0.555556] | | | ├─(pos)─ "token" is in { at, Sunday} 
[s:0.673012 n:5 np:3 miss:0] ; val:"Sci/Tech" prob:[0.6, 0, 0.4, 0] | | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ "token" is in {mer} [s:0.282435 n:13 np:2 miss:0] ; val:"Sports" prob:[0, 0.230769, 0, 0.769231] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ "token" is in { but, season, competition, put} [s:0.178609 n:11 np:9 miss:0] ; val:"Sports" prob:[0, 0.0909091, 0, 0.909091] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"World" prob:[0, 0.5, 0, 0.5] | | └─(neg)─ "token" is in { her, next, financial, trade, 2005, times} [s:0.0796668 n:272 np:17 miss:0] ; val:"World" prob:[0.0220588, 0.966912, 0.0110294, 0] | | ├─(pos)─ "token" is in {AP, online, to} [s:0.605797 n:17 np:5 miss:0] ; val:"World" prob:[0.294118, 0.529412, 0.176471, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ "token" is in { said} [s:0.215762 n:12 np:6 miss:0] ; val:"World" prob:[0, 0.75, 0.25, 0] | | | ├─(pos)─ "token" is in { and, but} [s:0.693147 n:6 np:3 miss:0] ; val:"World" prob:[0, 0.5, 0.5, 0] | | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in {;, as, A, last, been, ed, or, B, four, M, ...[26 left]} [s:0.00832007 n:255 np:224 miss:0] ; val:"World" prob:[0.00392157, 0.996078, 0, 0] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { HP} [s:0.142506 n:31 np:1 miss:0] ; val:"World" prob:[0.0322581, 0.967742, 0, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ "token" is in { left, even, out, head, Update, Airways, legal, never, management, bankruptcy, ...[2 left]} [s:0.159365 n:322 np:35 miss:0] ; val:"Sci/Tech" prob:[0.428571, 0.419255, 0.108696, 0.0434783] | ├─(pos)─ "token" is in { The, be, into} [s:0.259089 n:35 np:14 miss:0] ; val:"Business" prob:[0.0571429, 0.114286, 0.685714, 
0.142857] | | ├─(pos)─ "token" is in { won, Now} [s:0.419554 n:14 np:4 miss:0] ; val:"Business" prob:[0.142857, 0, 0.5, 0.357143] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ "token" is in { to} [s:0.500402 n:10 np:8 miss:0] ; val:"Business" prob:[0.2, 0, 0.7, 0.1] | | | ├─(pos)─ "token" is in { his} [s:0.37677 n:8 np:1 miss:0] ; val:"Business" prob:[0, 0, 0.875, 0.125] | | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { Iraq, legal, Ch} [s:0.486913 n:21 np:4 miss:0] ; val:"Business" prob:[0, 0.190476, 0.809524, 0] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { Iraq, least, an, N, talks, face, former, AD, Washington, Israeli, ...[7 left]} [s:0.292883 n:287 np:89 miss:0] ; val:"Sci/Tech" prob:[0.473868, 0.456446, 0.0383275, 0.0313589] | ├─(pos)─ "token" is in { oil, Oil, role} [s:0.246902 n:89 np:6 miss:0] ; val:"World" prob:[0, 0.932584, 0.0674157, 0] | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ "token" is in { AP, through, now, released, champion, shrine, Y, customers, coach, trial, ...[7 left]} [s:0.210026 n:198 np:50 miss:0] ; val:"Sci/Tech" prob:[0.686869, 0.242424, 0.0252525, 0.0454545] | ├─(pos)─ "token" is in { 1, news, Greek, match} [s:0.257923 n:50 np:7 miss:0] ; val:"World" prob:[0.2, 0.66, 0, 0.14] | | ├─(pos)─ "token" is in { Che} [s:0.410116 n:7 np:1 miss:0] ; val:"Sports" prob:[0, 0.142857, 0, 0.857143] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ "token" is in { would, like, Sp, v, Un, Online, claim, Dallas} [s:0.396658 n:43 np:9 miss:0] ; val:"World" prob:[0.232558, 0.744186, 0, 0.0232558] | | ├─(pos)─ "token" is in { said, world, or, just, like, violence} [s:0.194799 n:9 np:7 miss:0] ; val:"Sci/Tech" prob:[0.888889, 0, 0, 0.111111] 
| | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[0.5, 0, 0, 0.5] | | └─(neg)─ "token" is in {s, US, new, Wednesday, one, when, phone, X, charged, urged} [s:0.167555 n:34 np:31 miss:0] ; val:"World" prob:[0.0588235, 0.941176, 0, 0] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ val:"Sci/Tech" prob:[0.666667, 0.333333, 0, 0] | └─(neg)─ "token" is in { night, or, M, many, economy, Press, president, retail, chairman} [s:0.181087 n:148 np:12 miss:0] ; val:"Sci/Tech" prob:[0.851351, 0.101351, 0.0337838, 0.0135135] | ├─(pos)─ "token" is in { United, 3, down} [s:0.562335 n:12 np:3 miss:0] ; val:"World" prob:[0, 0.75, 0.166667, 0.0833333] | | ├─(pos)─ val:"Business" prob:[0, 0, 0.666667, 0.333333] | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ "token" is in { there, contract, Dar, hopes} [s:0.0608936 n:136 np:9 miss:0] ; val:"Sci/Tech" prob:[0.926471, 0.0441176, 0.0220588, 0.00735294] | ├─(pos)─ "token" is in { world, gt, By} [s:0.686962 n:9 np:5 miss:0] ; val:"Sci/Tech" prob:[0.555556, 0.444444, 0, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ "token" is in { sales, AFP, major, thousands, post} [s:0.145212 n:127 np:9 miss:0] ; val:"Sci/Tech" prob:[0.952756, 0.015748, 0.023622, 0.00787402] | ├─(pos)─ "token" is in {,} [s:0.636514 n:9 np:6 miss:0] ; val:"Sci/Tech" prob:[0.333333, 0.222222, 0.333333, 0.111111] | | ├─(pos)─ "token" is in { that, could} [s:0.693147 n:6 np:3 miss:0] ; val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ val:"World" prob:[0, 0.666667, 0, 0.333333] | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] └─(neg)─ "token" is in { its, million, service, price, Quote, buy, pay, users, do, economy, ...[49 left]} [s:0.129061 n:4324 np:1559 miss:0] ; val:"Sci/Tech" prob:[0.326549, 0.179232, 0.218548, 0.275671] ├─(pos)─ "token" is in { AP, com, South, u, d, bid, Wins, 
est, San, SAN, ...[35 left]} [s:0.129658 n:1559 np:490 miss:0] ; val:"Sci/Tech" prob:[0.434894, 0.0994227, 0.41052, 0.0551636] | ├─(pos)─ "token" is in { Saturday, before, reported, buy, ONDON, forces, West, games, head, 11, ...[15 left]} [s:0.182213 n:490 np:82 miss:0] ; val:"Sci/Tech" prob:[0.728571, 0.146939, 0.0755102, 0.0489796] | | ├─(pos)─ "token" is in { gold, game, another, W, small, Texas} [s:0.338648 n:82 np:15 miss:0] ; val:"World" prob:[0.195122, 0.536585, 0.0243902, 0.243902] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ "token" is in { had, can, The, return, band, picture} [s:0.267859 n:67 np:11 miss:0] ; val:"World" prob:[0.238806, 0.656716, 0.0298507, 0.0746269] | | | ├─(pos)─ "token" is in { may} [s:0.178609 n:11 np:2 miss:0] ; val:"Sci/Tech" prob:[0.909091, 0, 0.0909091, 0] | | | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ "token" is in { largest, companies, An, XP, comes} [s:0.324889 n:56 np:8 miss:0] ; val:"World" prob:[0.107143, 0.785714, 0.0178571, 0.0892857] | | | ├─(pos)─ "token" is in { use} [s:0.37677 n:8 np:1 miss:0] ; val:"Sci/Tech" prob:[0.75, 0.125, 0.125, 0] | | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | | └─(neg)─ "token" is in {ers} [s:0.410116 n:7 np:1 miss:0] ; val:"Sci/Tech" prob:[0.857143, 0, 0.142857, 0] | | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ "token" is in { H, offers, ations} [s:0.163686 n:48 np:3 miss:0] ; val:"World" prob:[0, 0.895833, 0, 0.104167] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ "token" is in { two, field} [s:0.0969517 n:45 np:6 miss:0] ; val:"World" prob:[0, 0.955556, 0, 0.0444444] | | | ├─(pos)─ "token" is in { as, supplies} [s:0.636514 n:6 np:4 miss:0] ; val:"World" prob:[0, 0.666667, 0, 0.333333] | | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"World" prob:[0, 1, 
0, 0] | | └─(neg)─ "token" is in { final, J, as, ant, play, lost, citing, systems, f, effort, ...[3 left]} [s:0.135671 n:408 np:50 miss:0] ; val:"Sci/Tech" prob:[0.835784, 0.0686275, 0.0857843, 0.00980392] | | ├─(pos)─ "token" is in { last, C, only, big, following, upgrade} [s:0.251652 n:50 np:11 miss:0] ; val:"Business" prob:[0.4, 0, 0.56, 0.04] | | | ├─(pos)─ "token" is in { K, next} [s:0.474139 n:11 np:2 miss:0] ; val:"Sci/Tech" prob:[0.818182, 0, 0, 0.181818] | | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ "token" is in { the, stock, E} [s:0.157531 n:39 np:27 miss:0] ; val:"Business" prob:[0.282051, 0, 0.717949, 0] | | | ├─(pos)─ "token" is in { with, :, You} [s:0.194799 n:27 np:6 miss:0] ; val:"Business" prob:[0.111111, 0, 0.888889, 0] | | | | ├─(pos)─ "token" is in {;, late} [s:0.693147 n:6 np:3 miss:0] ; val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ "token" is in {), 2} [s:0.428013 n:12 np:7 miss:0] ; val:"Sci/Tech" prob:[0.666667, 0, 0.333333, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ "token" is in { yesterday} [s:0.118494 n:5 np:2 miss:0] ; val:"Business" prob:[0.2, 0, 0.8, 0] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"Business" prob:[0.333333, 0, 0.666667, 0] | | └─(neg)─ "token" is in { AFP, ist, nation, Hurricane, bank, 8, file, Bloomberg, IA, body, ...[3 left]} [s:0.132309 n:358 np:24 miss:0] ; val:"Sci/Tech" prob:[0.896648, 0.0782123, 0.0195531, 0.00558659] | | ├─(pos)─ "token" is in { Friday, against, record, system, Japanese, Korea, 8, having} [s:0.679193 n:24 np:10 miss:0] ; val:"World" prob:[0.125, 0.583333, 0.25, 0.0416667] | | | ├─(pos)─ "token" is in {AP, Microsoft} [s:0.673012 n:10 np:4 miss:0] ; val:"Business" prob:[0.3, 0, 0.6, 0.1] | | | | ├─(pos)─ val:"Sci/Tech" 
prob:[0.75, 0, 0, 0.25] | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { but, m, if, largest, now, including, warned, reach, below} [s:0.0490454 n:334 np:54 miss:0] ; val:"Sci/Tech" prob:[0.952096, 0.0419162, 0.00299401, 0.00299401] | | ├─(pos)─ "token" is in { as, OS, Court, building} [s:0.245328 n:54 np:7 miss:0] ; val:"Sci/Tech" prob:[0.777778, 0.203704, 0.0185185, 0] | | | ├─(pos)─ "token" is in { } [s:0.137325 n:7 np:3 miss:0] ; val:"World" prob:[0, 0.857143, 0.142857, 0] | | | | ├─(pos)─ val:"World" prob:[0, 0.666667, 0.333333, 0] | | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ "token" is in { price} [s:0.0652347 n:47 np:3 miss:0] ; val:"Sci/Tech" prob:[0.893617, 0.106383, 0, 0] | | | ├─(pos)─ val:"World" prob:[0.333333, 0.666667, 0, 0] | | | └─(neg)─ "token" is in { by, went} [s:0.0835894 n:44 np:14 miss:0] ; val:"Sci/Tech" prob:[0.931818, 0.0681818, 0, 0] | | | ├─(pos)─ "token" is in { by} [s:0.120923 n:14 np:13 miss:0] ; val:"Sci/Tech" prob:[0.785714, 0.214286, 0, 0] | | | | ├─(pos)─ "token" is in { opening} [s:0.429323 n:13 np:2 miss:0] ; val:"Sci/Tech" prob:[0.846154, 0.153846, 0, 0] | | | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { have, L, took} [s:0.0300261 n:280 np:36 miss:0] ; val:"Sci/Tech" prob:[0.985714, 0.0107143, 0, 0.00357143] | | ├─(pos)─ "token" is in { state, once} [s:0.195127 n:36 np:5 miss:0] ; val:"Sci/Tech" prob:[0.888889, 0.0833333, 0, 0.0277778] | | | ├─(pos)─ "token" is in { to} [s:0.673012 n:5 np:3 miss:0] ; val:"World" prob:[0.4, 0.6, 0, 0] | | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ "token" is in { first} [s:0.142506 n:31 np:1 miss:0] ; val:"Sci/Tech" prob:[0.967742, 0, 0, 0.0322581] | | | ├─(pos)─ 
val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { -, United, today, may, other, 5, St, National, 19, ant, ...[38 left]} [s:0.110437 n:1069 np:606 miss:0] ; val:"Business" prob:[0.300281, 0.0776427, 0.564079, 0.0579981] | ├─(pos)─ "token" is in {af, killed, support, bronze, German, alleged, staff, Still} [s:0.0938253 n:606 np:28 miss:0] ; val:"Business" prob:[0.160066, 0.117162, 0.711221, 0.0115512] | | ├─(pos)─ "token" is in { gold, R} [s:0.257319 n:28 np:2 miss:0] ; val:"World" prob:[0, 0.928571, 0.0357143, 0.0357143] | | | ├─(pos)─ val:"Business" prob:[0, 0, 0.5, 0.5] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in {'s, public, K, top, help, now, much, industry, Y, games, ...[19 left]} [s:0.119999 n:578 np:199 miss:0] ; val:"Business" prob:[0.16782, 0.0778547, 0.743945, 0.0103806] | | ├─(pos)─ "token" is in { AFP, now, 100, America, 23, national, evidence, Russia, finally, Calif, ...[4 left]} [s:0.154461 n:199 np:56 miss:0] ; val:"Business" prob:[0.346734, 0.180905, 0.447236, 0.0251256] | | | ├─(pos)─ "token" is in {-, , set, service, offer, businesses} [s:0.375836 n:56 np:36 miss:0] ; val:"World" prob:[0.196429, 0.482143, 0.232143, 0.0892857] | | | | ├─(pos)─ "token" is in {Reuters, AFP, Service, available} [s:0.636514 n:36 np:24 miss:0] ; val:"Business" prob:[0.305556, 0.194444, 0.361111, 0.138889] | | | | | ├─(pos)─ "token" is in { stock, price, e, Service, far} [s:0.689671 n:24 np:11 miss:0] ; val:"Business" prob:[0.458333, 0, 0.541667, 0] | | | | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | | | └─(neg)─ "token" is in { YORK, forces, around, death} [s:0.679193 n:12 np:7 miss:0] ; val:"World" prob:[0, 0.583333, 0, 0.416667] | | | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | | 
└─(neg)─ "token" is in { US, were, President, women, no, in, ley, Google, get, being, ...[6 left]} [s:0.220418 n:143 np:46 miss:0] ; val:"Business" prob:[0.405594, 0.0629371, 0.531469, 0] | | | ├─(pos)─ "token" is in { off, ', largest} [s:0.150678 n:46 np:3 miss:0] ; val:"Business" prob:[0, 0.130435, 0.869565, 0] | | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | | └─(neg)─ "token" is in {-} [s:0.0492519 n:43 np:22 miss:0] ; val:"Business" prob:[0, 0.0697674, 0.930233, 0] | | | | ├─(pos)─ "token" is in { is} [s:0.24535 n:22 np:5 miss:0] ; val:"Business" prob:[0, 0.136364, 0.863636, 0] | | | | | ├─(pos)─ "token" is in { of} [s:0.673012 n:5 np:2 miss:0] ; val:"World" prob:[0, 0.6, 0.4, 0] | | | | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ "token" is in { last, 3, other, Web, Windows, run, ies, computer, half, scientists, ...[12 left]} [s:0.30532 n:97 np:43 miss:0] ; val:"Sci/Tech" prob:[0.597938, 0.0309278, 0.371134, 0] | | | ├─(pos)─ "token" is in {to, 0} [s:0.188113 n:43 np:2 miss:0] ; val:"Sci/Tech" prob:[0.953488, 0.0465116, 0, 0] | | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ "token" is in {', year, government, i, T, price, higher, :, airline, along} [s:0.282129 n:54 np:26 miss:0] ; val:"Business" prob:[0.314815, 0.0185185, 0.666667, 0] | | | ├─(pos)─ "token" is in { out} [s:0.163024 n:26 np:1 miss:0] ; val:"Business" prob:[0, 0.0384615, 0.961538, 0] | | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ "token" is in {year, following, further} [s:0.441673 n:28 np:9 miss:0] ; val:"Sci/Tech" prob:[0.607143, 0, 0.392857, 0] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ "token" is in { top, homes} [s:0.19057 n:19 np:4 miss:0] ; val:"Sci/Tech" prob:[0.894737, 
0, 0.105263, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { Japan, Is, League, your, island, Health, building} [s:0.0556517 n:379 np:8 miss:0] ; val:"Business" prob:[0.0738786, 0.0237467, 0.899736, 0.00263852] | | ├─(pos)─ "token" is in { --} [s:0.37677 n:8 np:1 miss:0] ; val:"Sci/Tech" prob:[0.75, 0.125, 0, 0.125] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ "token" is in {ate} [s:0.410116 n:7 np:1 miss:0] ; val:"Sci/Tech" prob:[0.857143, 0, 0, 0.142857] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { get, online, e, Che, via} [s:0.0611738 n:371 np:14 miss:0] ; val:"Business" prob:[0.0592992, 0.0215633, 0.919137, 0] | | ├─(pos)─ "token" is in { several} [s:0.2054 n:14 np:4 miss:0] ; val:"Sci/Tech" prob:[0.714286, 0, 0.285714, 0] | | | ├─(pos)─ val:"Business" prob:[0.25, 0, 0.75, 0] | | | └─(neg)─ "token" is in { airline} [s:0.325083 n:10 np:1 miss:0] ; val:"Sci/Tech" prob:[0.9, 0, 0.1, 0] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in {h, E, key, break, increase, Nations, End, sports} [s:0.0594733 n:357 np:21 miss:0] ; val:"Business" prob:[0.0336134, 0.022409, 0.943978, 0] | | ├─(pos)─ "token" is in { major} [s:0.154251 n:21 np:2 miss:0] ; val:"Business" prob:[0.285714, 0.238095, 0.47619, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ "token" is in { three, AFP} [s:0.436162 n:19 np:3 miss:0] ; val:"Business" prob:[0.315789, 0.157895, 0.526316, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ "token" is in { against, sales, report, cost, w} [s:0.661563 n:16 np:10 miss:0] ; val:"Business" prob:[0.375, 0, 0.625, 0] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { against, European} [s:0.0243353 n:336 np:27 miss:0] ; 
val:"Business" prob:[0.0178571, 0.00892857, 0.973214, 0] | | ├─(pos)─ "token" is in { say, heat} [s:0.348832 n:27 np:3 miss:0] ; val:"Business" prob:[0, 0.111111, 0.888889, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { have, A} [s:0.00676636 n:309 np:19 miss:0] ; val:"Business" prob:[0.0194175, 0, 0.980583, 0] | | ├─(pos)─ "token" is in { than} [s:0.19057 n:19 np:4 miss:0] ; val:"Business" prob:[0.105263, 0, 0.894737, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { world} [s:0.00555611 n:290 np:7 miss:0] ; val:"Business" prob:[0.0137931, 0, 0.986207, 0] | | ├─(pos)─ val:"Business" prob:[0.142857, 0, 0.857143, 0] | | └─(neg)─ val:"Business" prob:[0.0106007, 0, 0.989399, 0] | └─(neg)─ "token" is in { Friday, P, 2004, Windows, news, such, games, open, far, video, ...[19 left]} [s:0.171473 n:463 np:127 miss:0] ; val:"Sci/Tech" prob:[0.483801, 0.0259179, 0.37149, 0.11879] | ├─(pos)─ "token" is in {C, longer, gain} [s:0.0921219 n:127 np:4 miss:0] ; val:"Sci/Tech" prob:[0.929134, 0, 0.0472441, 0.023622] | | ├─(pos)─ val:"Business" prob:[0, 0, 0.75, 0.25] | | └─(neg)─ "token" is in { that} [s:0.0359944 n:123 np:36 miss:0] ; val:"Sci/Tech" prob:[0.95935, 0, 0.0243902, 0.0162602] | | ├─(pos)─ "token" is in { all, k} [s:0.193362 n:36 np:5 miss:0] ; val:"Sci/Tech" prob:[0.916667, 0, 0.0833333, 0] | | | ├─(pos)─ "token" is in { as} [s:0.291103 n:5 np:3 miss:0] ; val:"Business" prob:[0.4, 0, 0.6, 0] | | | | ├─(pos)─ val:"Sci/Tech" prob:[0.666667, 0, 0.333333, 0] | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { it} [s:0.0577434 n:87 np:8 miss:0] ; val:"Sci/Tech" prob:[0.977011, 0, 0, 0.0229885] | | ├─(pos)─ "token" is in { hand} [s:0.203483 n:8 np:1 miss:0] ; val:"Sci/Tech" prob:[0.75, 0, 0, 0.25] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] 
| | | └─(neg)─ "token" is in { N} [s:0.410116 n:7 np:1 miss:0] ; val:"Sci/Tech" prob:[0.857143, 0, 0, 0.142857] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { two, who, yesterday, Games, take, ago, official, Japanese, possible, too, ...[9 left]} [s:0.206335 n:336 np:96 miss:0] ; val:"Business" prob:[0.315476, 0.0357143, 0.494048, 0.154762] | ├─(pos)─ "token" is in { Up, Michael, also, 30, going, Hurricane, break, chain, way, Its, ...[3 left]} [s:0.376341 n:96 np:21 miss:0] ; val:"Sports" prob:[0.125, 0.125, 0.28125, 0.46875] | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { U, company, only, talks, found, there, 12, led, stop} [s:0.246409 n:75 np:19 miss:0] ; val:"Sports" prob:[0.16, 0.16, 0.08, 0.6] | | ├─(pos)─ "token" is in {ers, Is, Exchange} [s:0.472559 n:19 np:6 miss:0] ; val:"Sci/Tech" prob:[0.368421, 0.315789, 0.263158, 0.0526316] | | | ├─(pos)─ "token" is in {S} [s:0.450561 n:6 np:1 miss:0] ; val:"Business" prob:[0.166667, 0, 0.833333, 0] | | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ "token" is in {'s, Athens, rebels} [s:0.372503 n:13 np:5 miss:0] ; val:"Sci/Tech" prob:[0.461538, 0.461538, 0, 0.0769231] | | | ├─(pos)─ "token" is in { medals} [s:0.500402 n:5 np:1 miss:0] ; val:"World" prob:[0, 0.8, 0, 0.2] | | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ "token" is in {,, has} [s:0.562335 n:8 np:6 miss:0] ; val:"Sci/Tech" prob:[0.75, 0.25, 0, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { Aug, service, senior} [s:0.252608 n:56 np:5 miss:0] ; val:"Sports" prob:[0.0892857, 0.107143, 0.0178571, 0.785714] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { have, 13} [s:0.306366 n:51 np:7 miss:0] 
; val:"Sports" prob:[0.0980392, 0.0196078, 0.0196078, 0.862745] | | ├─(pos)─ "token" is in { ', D} [s:0.59827 n:7 np:2 miss:0] ; val:"Sci/Tech" prob:[0.714286, 0, 0.142857, 0.142857] | | | ├─(pos)─ val:"Business" prob:[0, 0, 0.5, 0.5] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in {-, be, more, week, C, N, much, like, rivals} [s:0.0516074 n:44 np:39 miss:0] ; val:"Sports" prob:[0, 0.0227273, 0, 0.977273] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ "token" is in { the} [s:0.500402 n:5 np:4 miss:0] ; val:"Sports" prob:[0, 0.2, 0, 0.8] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ "token" is in { Athens, senior} [s:0.104945 n:240 np:6 miss:0] ; val:"Business" prob:[0.391667, 0, 0.579167, 0.0291667] | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | └─(neg)─ "token" is in {U, game, run, version, several, star, update, Z, AT, st, ...[4 left]} [s:0.154119 n:234 np:34 miss:0] ; val:"Business" prob:[0.401709, 0, 0.594017, 0.0042735] | ├─(pos)─ "token" is in { big} [s:0.132691 n:34 np:1 miss:0] ; val:"Sci/Tech" prob:[0.970588, 0, 0, 0.0294118] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { Corp, what, made, online, y, move, any, researchers, When, Linux, ...[2 left]} [s:0.184159 n:200 np:33 miss:0] ; val:"Business" prob:[0.305, 0, 0.695, 0] | ├─(pos)─ "token" is in { Group, name} [s:0.228632 n:33 np:2 miss:0] ; val:"Sci/Tech" prob:[0.939394, 0, 0.0606061, 0] | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { reported, data, Germany, trying, Price, launched, )., survey} [s:0.118181 n:167 np:31 miss:0] ; val:"Business" prob:[0.179641, 0, 0.820359, 0] | ├─(pos)─ "token" is in { new, NEW, TV} [s:0.0964683 n:31 np:8 miss:0] ; val:"Sci/Tech" prob:[0.612903, 0, 0.387097, 0] | | ├─(pos)─ "token" is in { , :} [s:0.323642 n:8 np:5 miss:0] ; val:"Business" 
prob:[0.25, 0, 0.75, 0] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[0.666667, 0, 0.333333, 0] | | └─(neg)─ "token" is in { in} [s:0.324894 n:23 np:9 miss:0] ; val:"Sci/Tech" prob:[0.73913, 0, 0.26087, 0] | | ├─(pos)─ val:"Business" prob:[0.333333, 0, 0.666667, 0] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { California, wants, Buy, Scientists} [s:0.149806 n:136 np:7 miss:0] ; val:"Business" prob:[0.0808824, 0, 0.919118, 0] | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { people, give} [s:0.0756416 n:129 np:4 miss:0] ; val:"Business" prob:[0.0310078, 0, 0.968992, 0] | ├─(pos)─ val:"Sci/Tech" prob:[0.75, 0, 0.25, 0] | └─(neg)─ val:"Business" prob:[0.008, 0, 0.992, 0] └─(neg)─ "token" is in { prices, report, shares, =, maker, firm, system, National, economic, Microsoft, ...[33 left]} [s:0.109532 n:2765 np:495 miss:0] ; val:"Sports" prob:[0.265461, 0.224231, 0.110307, 0.4] ├─(pos)─ "token" is in { Olympic, down, AFP, u, 100, League, Germany, asked, players, reach, ...[6 left]} [s:0.122271 n:495 np:56 miss:0] ; val:"Sci/Tech" prob:[0.565657, 0.0929293, 0.282828, 0.0585859] | ├─(pos)─ "token" is in { -, who, capital, where, president} [s:0.30094 n:56 np:29 miss:0] ; val:"World" prob:[0.232143, 0.410714, 0.0357143, 0.321429] | | ├─(pos)─ "token" is in { when, US, Apple, best} [s:0.459693 n:29 np:5 miss:0] ; val:"World" prob:[0.103448, 0.724138, 0.0689655, 0.103448] | | | ├─(pos)─ "token" is in { computer} [s:0.291103 n:5 np:2 miss:0] ; val:"Sci/Tech" prob:[0.6, 0, 0.4, 0] | | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | | └─(neg)─ val:"Business" prob:[0.333333, 0, 0.666667, 0] | | | └─(neg)─ "token" is in { one, After, Giants} [s:0.37677 n:24 np:3 miss:0] ; val:"World" prob:[0, 0.875, 0, 0.125] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { , L, South, National, number, So} [s:0.516113 n:27 np:18 miss:0] ; 
val:"Sports" prob:[0.37037, 0.0740741, 0, 0.555556] | | ├─(pos)─ "token" is in { leader, Russian, web} [s:0.450561 n:18 np:3 miss:0] ; val:"Sports" prob:[0.0555556, 0.111111, 0, 0.833333] | | | ├─(pos)─ val:"World" prob:[0.333333, 0.666667, 0, 0] | | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { off, day, ASHINGTON, After, because, season, news, there, demand, For, ...[17 left]} [s:0.261216 n:439 np:125 miss:0] ; val:"Sci/Tech" prob:[0.6082, 0.0523918, 0.314351, 0.0250569] | ├─(pos)─ "token" is in { , #, is, Tuesday, first, ing, year, months, Shares, far, ...[1 left]} [s:0.271538 n:125 np:95 miss:0] ; val:"Business" prob:[0.072, 0.112, 0.76, 0.056] | | ├─(pos)─ "token" is in { with, ruling, Report} [s:0.154177 n:95 np:6 miss:0] ; val:"Business" prob:[0, 0.0421053, 0.936842, 0.0210526] | | | ├─(pos)─ "token" is in { one} [s:0.450561 n:6 np:1 miss:0] ; val:"World" prob:[0, 0.5, 0.166667, 0.333333] | | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | | └─(neg)─ "token" is in {'s} [s:0.291103 n:5 np:2 miss:0] ; val:"World" prob:[0, 0.6, 0, 0.4] | | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | | └─(neg)─ val:"Sports" prob:[0, 0.333333, 0, 0.666667] | | | └─(neg)─ "token" is in { hits} [s:0.0616067 n:89 np:1 miss:0] ; val:"Business" prob:[0, 0.011236, 0.988764, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { last, were, leader, He} [s:0.405465 n:30 np:10 miss:0] ; val:"World" prob:[0.3, 0.333333, 0.2, 0.166667] | | ├─(pos)─ "token" is in { their, but} [s:0.693147 n:10 np:5 miss:0] ; val:"World" prob:[0, 0.5, 0, 0.5] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in {), percent, major} [s:0.38658 n:20 np:8 miss:0] ; val:"Sci/Tech" prob:[0.45, 0.25, 0.3, 0] | | ├─(pos)─ "token" is in { as, Space} [s:0.661563 n:8 np:3 miss:0] ; val:"World" prob:[0.375, 
0.625, 0, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { -, 's, 3} [s:0.453913 n:12 np:5 miss:0] ; val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { new} [s:0.212074 n:7 np:2 miss:0] ; val:"Business" prob:[0.142857, 0, 0.857143, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { last, second, percent, West, whether, struck, ine, They, ps, Market, ...[3 left]} [s:0.173566 n:314 np:46 miss:0] ; val:"Sci/Tech" prob:[0.821656, 0.0286624, 0.136943, 0.0127389] | ├─(pos)─ "token" is in { plan, Washington, ine} [s:0.252506 n:46 np:5 miss:0] ; val:"Business" prob:[0.217391, 0.152174, 0.543478, 0.0869565] | | ├─(pos)─ "token" is in { is, two} [s:0.118494 n:5 np:2 miss:0] ; val:"Sports" prob:[0, 0, 0.2, 0.8] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"Sports" prob:[0, 0, 0.333333, 0.666667] | | └─(neg)─ "token" is in { United, officials, still, past, case, course} [s:0.507404 n:41 np:15 miss:0] ; val:"Business" prob:[0.243902, 0.170732, 0.585366, 0] | | ├─(pos)─ "token" is in { such, Florida, appears} [s:0.673012 n:15 np:6 miss:0] ; val:"Sci/Tech" prob:[0.6, 0.4, 0, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { The} [s:0.111308 n:26 np:2 miss:0] ; val:"Business" prob:[0.0384615, 0.0384615, 0.923077, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | └─(neg)─ "token" is in { State} [s:0.173205 n:24 np:1 miss:0] ; val:"Business" prob:[0, 0.0416667, 0.958333, 0] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { W, case, lower, V, car, figures, chairman, defense} [s:0.0578847 n:268 np:17 miss:0] ; val:"Sci/Tech" prob:[0.925373, 0.00746269, 0.0671642, 0] | ├─(pos)─ "token" is in {2, latest, IBM} 
[s:0.691416 n:17 np:8 miss:0] ; val:"Business" prob:[0.470588, 0, 0.529412, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in {es, fell, Street} [s:0.062425 n:251 np:6 miss:0] ; val:"Sci/Tech" prob:[0.956175, 0.00796813, 0.0358566, 0] | ├─(pos)─ "token" is in { New} [s:0.450561 n:6 np:1 miss:0] ; val:"Business" prob:[0.166667, 0, 0.833333, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { quot, break} [s:0.0245847 n:245 np:18 miss:0] ; val:"Sci/Tech" prob:[0.97551, 0.00816327, 0.0163265, 0] | ├─(pos)─ "token" is in { to, chief} [s:0.450561 n:18 np:15 miss:0] ; val:"Sci/Tech" prob:[0.833333, 0, 0.166667, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in {\, about, real} [s:0.031702 n:227 np:22 miss:0] ; val:"Sci/Tech" prob:[0.986784, 0.00881057, 0.00440529, 0] | ├─(pos)─ "token" is in { A, W} [s:0.127748 n:22 np:7 miss:0] ; val:"Sci/Tech" prob:[0.863636, 0.0909091, 0.0454545, 0] | | ├─(pos)─ "token" is in { former} [s:0.212074 n:7 np:1 miss:0] ; val:"Sci/Tech" prob:[0.714286, 0.285714, 0, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[0.833333, 0.166667, 0, 0] | | └─(neg)─ "token" is in { } [s:0.094974 n:15 np:4 miss:0] ; val:"Sci/Tech" prob:[0.933333, 0, 0.0666667, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[0.75, 0, 0.25, 0] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] └─(neg)─ "token" is in { win, target, run, en, start, shot, wins, De, Team, time, ...[25 left]} [s:0.125221 n:2270 np:629 miss:0] ; val:"Sports" prob:[0.2, 0.252863, 0.0726872, 0.474449] ├─(pos)─ "token" is in {com, according, court, K, investors, biggest, agreed, peace, India, Court, ...[17 left]} [s:0.0920304 n:629 np:79 miss:0] ; val:"Sports" prob:[0.0524642, 0.0651828, 0.0190779, 0.863275] | ├─(pos)─ "token" is in 
{ first, one, medal, year, F, cut, season, al, international, away, ...[3 left]} [s:0.373536 n:79 np:37 miss:0] ; val:"Sports" prob:[0.265823, 0.265823, 0.0632911, 0.405063] | | ├─(pos)─ "token" is in { as, into, last, become} [s:0.217677 n:37 np:5 miss:0] ; val:"Sports" prob:[0.0810811, 0.027027, 0.108108, 0.783784] | | | ├─(pos)─ "token" is in { at} [s:0.291103 n:5 np:3 miss:0] ; val:"Business" prob:[0, 0.2, 0.6, 0.2] | | | | ├─(pos)─ val:"World" prob:[0, 0.333333, 0.333333, 0.333333] | | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ "token" is in { win, Micro} [s:0.241954 n:32 np:4 miss:0] ; val:"Sports" prob:[0.09375, 0, 0.03125, 0.875] | | | ├─(pos)─ val:"Sci/Tech" prob:[0.75, 0, 0, 0.25] | | | └─(neg)─ "token" is in { start} [s:0.154076 n:28 np:1 miss:0] ; val:"Sports" prob:[0, 0, 0.0357143, 0.964286] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ "token" is in {O, estyle} [s:0.145979 n:42 np:2 miss:0] ; val:"World" prob:[0.428571, 0.47619, 0.0238095, 0.0714286] | | ├─(pos)─ val:"Business" prob:[0, 0, 0.5, 0.5] | | └─(neg)─ "token" is in { at, from, Olympic, Thursday, Naj} [s:0.349333 n:40 np:19 miss:0] ; val:"World" prob:[0.45, 0.5, 0, 0.05] | | ├─(pos)─ "token" is in { is, been} [s:0.336496 n:19 np:2 miss:0] ; val:"World" prob:[0.0526316, 0.894737, 0, 0.0526316] | | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0, 0, 0.5] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { has} [s:0.223562 n:21 np:2 miss:0] ; val:"Sci/Tech" prob:[0.809524, 0.142857, 0, 0.047619] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in {d, central} [s:0.336496 n:19 np:2 miss:0] ; val:"Sci/Tech" prob:[0.894737, 0.0526316, 0, 0.0526316] | | ├─(pos)─ val:"World" prob:[0, 0.5, 0, 0.5] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { Internet, sales, federal, ak, hope} [s:0.0458485 n:550 np:12 miss:0] ; val:"Sports" prob:[0.0218182, 0.0363636, 0.0127273, 
0.929091] | ├─(pos)─ "token" is in { new, companies, Is} [s:0.679193 n:12 np:5 miss:0] ; val:"World" prob:[0.0833333, 0.416667, 0.333333, 0.166667] | | ├─(pos)─ "token" is in { that} [s:0.500402 n:5 np:1 miss:0] ; val:"Business" prob:[0.2, 0, 0.8, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { game} [s:0.59827 n:7 np:2 miss:0] ; val:"World" prob:[0, 0.714286, 0, 0.285714] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ "token" is in { President, growth, forces, video, area, Scientists} [s:0.073838 n:538 np:11 miss:0] ; val:"Sports" prob:[0.0204461, 0.027881, 0.00557621, 0.946097] | ├─(pos)─ "token" is in { can, West, game} [s:0.585953 n:11 np:8 miss:0] ; val:"Sci/Tech" prob:[0.727273, 0.272727, 0, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ "token" is in { team, quot, top, say, release, Britain, General, Calif, ready} [s:0.0473182 n:527 np:96 miss:0] ; val:"Sports" prob:[0.0056926, 0.0227704, 0.0056926, 0.965844] | ├─(pos)─ "token" is in {..., when, reported, An, across, De, ling} [s:0.254579 n:96 np:10 miss:0] ; val:"Sports" prob:[0.0208333, 0.125, 0.0208333, 0.833333] | | ├─(pos)─ "token" is in { in, not, N} [s:0.500402 n:10 np:8 miss:0] ; val:"World" prob:[0.2, 0.8, 0, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in {Reuters} [s:0.0536344 n:86 np:12 miss:0] ; val:"Sports" prob:[0, 0.0465116, 0.0232558, 0.930233] | | ├─(pos)─ "token" is in { Corp, run} [s:0.450561 n:12 np:2 miss:0] ; val:"Sports" prob:[0, 0, 0.166667, 0.833333] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ "token" is in { could, fifth} [s:0.0687355 n:74 np:8 miss:0] ; val:"Sports" prob:[0, 0.0540541, 0, 0.945946] | | ├─(pos)─ "token" is in {'s} [s:0.323642 n:8 np:6 miss:0] ; 
val:"Sports" prob:[0, 0.375, 0, 0.625] | | | ├─(pos)─ val:"Sports" prob:[0, 0.166667, 0, 0.833333] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { Friday} [s:0.0785158 n:66 np:1 miss:0] ; val:"Sports" prob:[0, 0.0151515, 0, 0.984848] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | └─(neg)─ "token" is in {\, most} [s:0.015896 n:431 np:15 miss:0] ; val:"Sports" prob:[0.00232019, 0, 0.00232019, 0.99536] | ├─(pos)─ "token" is in { this} [s:0.24493 n:15 np:1 miss:0] ; val:"Sports" prob:[0.0666667, 0, 0.0666667, 0.866667] | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in {), s, TV} [s:0.257319 n:14 np:13 miss:0] ; val:"Sports" prob:[0.0714286, 0, 0, 0.928571] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] └─(neg)─ "token" is in {AFP, al, forces, accused, its, airline, global, Army, rebel, within, ...[27 left]} [s:0.155104 n:1641 np:308 miss:0] ; val:"Sports" prob:[0.256551, 0.324802, 0.0932358, 0.325411] ├─(pos)─ "token" is in { Friday, according, way, online, even, industry, Network, decision, too, Wal, ...[3 left]} [s:0.113281 n:308 np:54 miss:0] ; val:"World" prob:[0.0422078, 0.853896, 0.0746753, 0.0292208] | ├─(pos)─ "token" is in { out, some, high, AFP, online, big, posted, best, hundreds} [s:0.340583 n:54 np:21 miss:0] ; val:"World" prob:[0.222222, 0.518519, 0.259259, 0] | | ├─(pos)─ "token" is in { US, but, G, arrested, employees} [s:0.682908 n:21 np:12 miss:0] ; val:"Business" prob:[0.380952, 0.047619, 0.571429, 0] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ "token" is in { killing} [s:0.348832 n:9 np:1 miss:0] ; val:"Sci/Tech" prob:[0.888889, 0.111111, 0, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in {ets, k} [s:0.118461 n:33 np:2 miss:0] ; val:"World" prob:[0.121212, 0.818182, 0.0606061, 0] | | 
├─(pos)─ val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | └─(neg)─ "token" is in {;, an, US, three, es, Sad, run, wounded} [s:0.261347 n:31 np:25 miss:0] ; val:"World" prob:[0.0967742, 0.870968, 0.0322581, 0] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in { new, S} [s:0.693147 n:6 np:3 miss:0] ; val:"Sci/Tech" prob:[0.5, 0.333333, 0.166667, 0] | | ├─(pos)─ val:"World" prob:[0, 0.666667, 0.333333, 0] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in {ana, management, 16, They, us, expectations, dispute} [s:0.119042 n:254 np:10 miss:0] ; val:"World" prob:[0.00393701, 0.925197, 0.0354331, 0.0354331] | ├─(pos)─ "token" is in {al, away} [s:0.38593 n:10 np:3 miss:0] ; val:"Business" prob:[0, 0, 0.6, 0.4] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ "token" is in { out} [s:0.410116 n:7 np:1 miss:0] ; val:"Business" prob:[0, 0, 0.857143, 0.142857] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { it, th, jobs} [s:0.0510061 n:244 np:18 miss:0] ; val:"World" prob:[0.00409836, 0.963115, 0.0122951, 0.0204918] | ├─(pos)─ "token" is in { they, year, received} [s:0.636514 n:18 np:6 miss:0] ; val:"World" prob:[0, 0.666667, 0.111111, 0.222222] | | ├─(pos)─ "token" is in {;, not} [s:0.636514 n:6 np:4 miss:0] ; val:"Sports" prob:[0, 0, 0.333333, 0.666667] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ "token" is in { Olympics, al} [s:0.0384173 n:226 np:4 miss:0] ; val:"World" prob:[0.00442478, 0.986726, 0.00442478, 0.00442478] | ├─(pos)─ val:"World" prob:[0, 0.5, 0.25, 0.25] | └─(neg)─ "token" is in { from} [s:0.0116992 n:222 np:17 miss:0] ; val:"World" prob:[0.0045045, 0.995495, 0, 0] | ├─(pos)─ "token" is in { and, after, US, accused, following} [s:0.111392 n:17 np:14 miss:0] ; val:"World" prob:[0.0588235, 0.941176, 0, 0] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | 
└─(neg)─ val:"World" prob:[0.333333, 0.666667, 0, 0] | └─(neg)─ val:"World" prob:[0, 1, 0, 0] └─(neg)─ "token" is in { are, public, industry, Street, key, Service, uts, light, supply, each, ...[16 left]} [s:0.0820205 n:1333 np:217 miss:0] ; val:"Sports" prob:[0.306077, 0.202551, 0.0975244, 0.393848] ├─(pos)─ "token" is in { British, trial, 13, call, Williams} [s:0.0739193 n:217 np:11 miss:0] ; val:"Sci/Tech" prob:[0.668203, 0.16129, 0.129032, 0.0414747] | ├─(pos)─ "token" is in {:, est} [s:0.280038 n:11 np:3 miss:0] ; val:"Sports" prob:[0.454545, 0, 0, 0.545455] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { to, will, out} [s:0.562335 n:8 np:6 miss:0] ; val:"Sports" prob:[0.25, 0, 0, 0.75] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in {H, The, financial, unit, great, outlook, There, sharing, Association} [s:0.12148 n:206 np:21 miss:0] ; val:"Sci/Tech" prob:[0.679612, 0.169903, 0.135922, 0.0145631] | ├─(pos)─ "token" is in { (, about, Olympics, say, Sox} [s:0.59827 n:21 np:6 miss:0] ; val:"Business" prob:[0.238095, 0, 0.714286, 0.047619] | | ├─(pos)─ "token" is in { them} [s:0.450561 n:6 np:1 miss:0] ; val:"Sci/Tech" prob:[0.833333, 0, 0, 0.166667] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { were, had, B, St, former, peace, alleged, Foreign, intelligence, Are, ...[1 left]} [s:0.177973 n:185 np:21 miss:0] ; val:"Sci/Tech" prob:[0.72973, 0.189189, 0.0702703, 0.0108108] | ├─(pos)─ "token" is in { St} [s:0.18248 n:21 np:3 miss:0] ; val:"World" prob:[0, 0.666667, 0.333333, 0] | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { of} [s:0.0643854 n:18 np:14 miss:0] ; val:"World" prob:[0, 0.777778, 0.222222, 0] | | ├─(pos)─ "token" is in { Monday, first, three, man, d} [s:0.419554 n:14 np:9 miss:0] ; val:"World" prob:[0, 0.714286, 
0.285714, 0] | | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | | └─(neg)─ val:"Business" prob:[0, 0.2, 0.8, 0] | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ "token" is in { The, AP, into, Google, can, P, home, 3, S, T, ...[15 left]} [s:0.204332 n:164 np:114 miss:0] ; val:"Sci/Tech" prob:[0.823171, 0.128049, 0.0365854, 0.0121951] | ├─(pos)─ "token" is in { has} [s:0.0423724 n:114 np:24 miss:0] ; val:"Sci/Tech" prob:[0.973684, 0, 0.0263158, 0] | | ├─(pos)─ "token" is in {,, s, U, ia} [s:0.283048 n:24 np:20 miss:0] ; val:"Sci/Tech" prob:[0.875, 0, 0.125, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | | └─(neg)─ val:"Business" prob:[0.25, 0, 0.75, 0] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in {;, with, not, last, Internet, 5, This, AT, Report, IP} [s:0.420023 n:50 np:23 miss:0] ; val:"Sci/Tech" prob:[0.48, 0.42, 0.06, 0.04] | ├─(pos)─ "token" is in { A, Tuesday, night, women, go} [s:0.523586 n:23 np:5 miss:0] ; val:"Sci/Tech" prob:[0.782609, 0, 0.130435, 0.0869565] | | ├─(pos)─ val:"Business" prob:[0, 0, 0.6, 0.4] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { reports, Software} [s:0.280681 n:27 np:6 miss:0] ; val:"World" prob:[0.222222, 0.777778, 0, 0] | ├─(pos)─ val:"Sci/Tech" prob:[0.833333, 0.166667, 0, 0] | └─(neg)─ val:"World" prob:[0.047619, 0.952381, 0, 0] └─(neg)─ "token" is in { was, TH, t, G, him, th, five, much, ets, head, ...[27 left]} [s:0.0904388 n:1116 np:361 miss:0] ; val:"Sports" prob:[0.235663, 0.210573, 0.0913978, 0.462366] ├─(pos)─ "token" is in { announced, K, St, technology, how, 24, ine, Online, decided} [s:0.087771 n:361 np:29 miss:0] ; val:"Sports" prob:[0.105263, 0.121884, 0.0166205, 0.756233] | ├─(pos)─ "token" is in { Sunday, In, manager, moved} [s:0.619376 n:29 np:9 miss:0] ; val:"Sci/Tech" prob:[0.551724, 0, 0.137931, 0.310345] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ "token" is in { has, American} [s:0.310242 n:20 np:3 miss:0] ; val:"Sci/Tech" prob:[0.8, 
0, 0.2, 0] | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | └─(neg)─ "token" is in { is} [s:0.142171 n:17 np:2 miss:0] ; val:"Sci/Tech" prob:[0.941176, 0, 0.0588235, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0, 0.5, 0] | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | └─(neg)─ "token" is in { as, into, gold, some, record, ers, four, 1, least, With, ...[21 left]} [s:0.175074 n:332 np:207 miss:0] ; val:"Sports" prob:[0.0662651, 0.13253, 0.0060241, 0.795181] | ├─(pos)─ "token" is in { world, Corp, days, recent} [s:0.0753721 n:207 np:18 miss:0] ; val:"Sports" prob:[0.0386473, 0, 0, 0.961353] | | ├─(pos)─ "token" is in { Saturday, 100, champion, player, marathon} [s:0.668248 n:18 np:11 miss:0] ; val:"Sports" prob:[0.388889, 0, 0, 0.611111] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ "token" is in { law} [s:0.0330111 n:189 np:1 miss:0] ; val:"Sports" prob:[0.00529101, 0, 0, 0.994709] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | └─(neg)─ "token" is in { is, his, final, m, es, ets, Wins, player, Bay, beating, ...[1 left]} [s:0.24892 n:125 np:72 miss:0] ; val:"Sports" prob:[0.112, 0.352, 0.016, 0.52] | ├─(pos)─ "token" is in { people, Iraqi, plan, there, charged, ir, application, preliminary} [s:0.271505 n:72 np:9 miss:0] ; val:"Sports" prob:[0.0277778, 0.152778, 0.0138889, 0.805556] | | ├─(pos)─ "token" is in { all, As, recent} [s:0.636514 n:9 np:3 miss:0] ; val:"World" prob:[0.222222, 0.666667, 0.111111, 0] | | | ├─(pos)─ val:"Sci/Tech" prob:[0.666667, 0, 0.333333, 0] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ "token" is in {..., American} [s:0.196755 n:63 np:4 miss:0] ; val:"Sports" prob:[0, 0.0793651, 0, 0.920635] | | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ val:"Sports" prob:[0, 0.0169492, 0, 0.983051] | └─(neg)─ "token" is in {old, computer, Phelps, England, Y, and} [s:0.240327 n:53 np:10 miss:0] ; val:"World" prob:[0.226415, 0.622642, 
0.0188679, 0.132075] | ├─(pos)─ "token" is in { they, three, G} [s:0.610864 n:10 np:7 miss:0] ; val:"Sci/Tech" prob:[0.7, 0, 0, 0.3] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ val:"Sports" prob:[0, 0, 0, 1] | └─(neg)─ "token" is in { won, hit, as, finals, ancient, developed} [s:0.542349 n:43 np:10 miss:0] ; val:"World" prob:[0.116279, 0.767442, 0.0232558, 0.0930233] | ├─(pos)─ val:"Sci/Tech" prob:[0.5, 0, 0.1, 0.4] | └─(neg)─ val:"World" prob:[0, 1, 0, 0] └─(neg)─ "token" is in { Inc, company, what, ocks, IPO, computer, More, New, Y, Europe, ...[8 left]} [s:0.108066 n:755 np:97 miss:0] ; val:"Sports" prob:[0.298013, 0.25298, 0.127152, 0.321854] ├─(pos)─ "token" is in { Olympic, week, D, profit, since, test, Federal, battle, week} [s:0.283888 n:97 np:11 miss:0] ; val:"Sci/Tech" prob:[0.85567, 0.0515464, 0.0721649, 0.0206186] | ├─(pos)─ "token" is in {..., women, 000, go} [s:0.689009 n:11 np:5 miss:0] ; val:"Business" prob:[0, 0.363636, 0.545455, 0.0909091] | | ├─(pos)─ "token" is in { U} [s:0.500402 n:5 np:1 miss:0] ; val:"World" prob:[0, 0.8, 0, 0.2] | | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | | └─(neg)─ val:"World" prob:[0, 1, 0, 0] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { second} [s:0.0633548 n:86 np:1 miss:0] ; val:"Sci/Tech" prob:[0.965116, 0.0116279, 0.0116279, 0.0116279] | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | └─(neg)─ "token" is in { like} [s:0.0417817 n:85 np:3 miss:0] ; val:"Sci/Tech" prob:[0.976471, 0.0117647, 0.0117647, 0] | ├─(pos)─ val:"Sci/Tech" prob:[0.666667, 0.333333, 0, 0] | └─(neg)─ "token" is in { just} [s:0.0425738 n:82 np:3 miss:0] ; val:"Sci/Tech" prob:[0.987805, 0, 0.0121951, 0] | ├─(pos)─ val:"Sci/Tech" prob:[0.666667, 0, 0.333333, 0] | └─(neg)─ val:"Sci/Tech" prob:[1, 0, 0, 0] └─(neg)─ "token" is in { out, NEW, Olympics, T, South, After, even, games, z, following, ...[20 left]} [s:0.135518 n:658 np:179 miss:0] ; val:"Sports" prob:[0.215805, 0.282675, 0.135258, 0.366261] ├─(pos)─ 
"token" is in { President, sales, news, IS, across, 17, international, Sudan, good, track, ...[2 left]} [s:0.218274 n:179 np:25 miss:0] ; val:"Sports" prob:[0.0391061, 0.106145, 0.0893855, 0.765363] | ├─(pos)─ "token" is in { out, Hugo, First} [s:0.444797 n:25 np:9 miss:0] ; val:"World" prob:[0.28, 0.44, 0.2, 0.08] | | ├─(pos)─ "token" is in { The, 17} [s:0.686962 n:9 np:5 miss:0] ; val:"Business" prob:[0, 0.222222, 0.555556, 0.222222] | | | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | | | └─(neg)─ val:"World" prob:[0, 0.5, 0, 0.5] | | └─(neg)─ "token" is in {O, access} [s:0.126096 n:16 np:5 miss:0] ; val:"World" prob:[0.4375, 0.5625, 0, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[0.8, 0.2, 0, 0] | | └─(neg)─ val:"World" prob:[0.272727, 0.727273, 0, 0] | └─(neg)─ "token" is in { NEW, how, Australian} [s:0.0935111 n:154 np:11 miss:0] ; val:"Sports" prob:[0, 0.0519481, 0.0714286, 0.876623] | ├─(pos)─ "token" is in { more} [s:0.222086 n:11 np:2 miss:0] ; val:"Business" prob:[0, 0, 0.636364, 0.363636] | | ├─(pos)─ val:"Sports" prob:[0, 0, 0, 1] | | └─(neg)─ val:"Business" prob:[0, 0, 0.777778, 0.222222] | └─(neg)─ "token" is in {Iraq, George} [s:0.0648147 n:143 np:3 miss:0] ; val:"Sports" prob:[0, 0.0559441, 0.027972, 0.916084] | ├─(pos)─ val:"World" prob:[0, 1, 0, 0] | └─(neg)─ val:"Sports" prob:[0, 0.0357143, 0.0285714, 0.935714] └─(neg)─ "token" is in { back, U, top, only, ASHINGTON, close, event, One, track, went, ...[17 left]} [s:0.146211 n:479 np:83 miss:0] ; val:"World" prob:[0.281837, 0.348643, 0.152401, 0.217119] ├─(pos)─ "token" is in { Iraq, or, higher, quarter, 7, number, Airlines, offers, employees, ground, ...[1 left]} [s:0.312896 n:83 np:21 miss:0] ; val:"Sports" prob:[0.0963855, 0, 0.349398, 0.554217] | ├─(pos)─ "token" is in {2} [s:0.191444 n:21 np:1 miss:0] ; val:"Business" prob:[0.047619, 0, 0.952381, 0] | | ├─(pos)─ val:"Sci/Tech" prob:[1, 0, 0, 0] | | └─(neg)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ "token" is in { over, up, T} [s:0.180622 n:62 np:5 
miss:0] ; val:"Sports" prob:[0.112903, 0, 0.145161, 0.741935] | ├─(pos)─ val:"Business" prob:[0, 0, 1, 0] | └─(neg)─ val:"Sports" prob:[0.122807, 0, 0.0701754, 0.807018] └─(neg)─ "token" is in {US, P, former, leader, police, ago, near, up, Pakistan, Al, ...[19 left]} [s:0.203671 n:396 np:100 miss:0] ; val:"World" prob:[0.320707, 0.421717, 0.111111, 0.146465] ├─(pos)─ "token" is in {2, so, O, SAN} [s:0.208652 n:100 np:7 miss:0] ; val:"World" prob:[0, 0.91, 0.01, 0.08] | ├─(pos)─ val:"Sports" prob:[0, 0, 0.142857, 0.857143] | └─(neg)─ val:"World" prob:[0, 0.978495, 0, 0.0215054] └─(neg)─ "token" is in { &, all, according, Australia, o, IS, go, how, As, phone, ...[12 left]} [s:0.169526 n:296 np:55 miss:0] ; val:"Sci/Tech" prob:[0.429054, 0.256757, 0.14527, 0.168919] ├─(pos)─ val:"Sci/Tech" prob:[0.981818, 0.0181818, 0, 0] └─(neg)─ val:"World" prob:[0.302905, 0.311203, 0.178423, 0.207469]
Then, you can look at the conditions used by the model. For example, the condition "token" is in {prices, largest, maker, crude, sell, Hugo, workers, (Bloomberg), analysts} means that the model checks whether the article contains at least one of those tokens.
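To make the meaning of such a condition concrete, here is a minimal sketch in plain Python. It only illustrates the semantics (a non-empty intersection between the article's tokens and the condition's set); it is not YDF's internal implementation. The helper name `condition_is_in` and the two example articles are hypothetical, and whitespace splitting is assumed for simplicity.

```python
def condition_is_in(tokens, condition_set):
    """Return True if at least one token of the example is in the condition's set."""
    return not set(tokens).isdisjoint(condition_set)


# Set of tokens taken from the example condition above.
condition_set = {
    "prices", "largest", "maker", "crude", "sell",
    "Hugo", "workers", "(Bloomberg)", "analysts",
}

# Hypothetical articles, tokenized by whitespace for illustration only.
article_a = "Oil prices rise as crude supply tightens".split()
article_b = "Local team wins the final game".split()

result_a = condition_is_in(article_a, condition_set)  # shares "prices" and "crude"
result_b = condition_is_in(article_b, condition_set)  # shares no token

print(result_a, result_b)
```

An example for which the condition is true follows the (pos) branch of the node, and the (neg) branch otherwise.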
model.plot_tree(max_depth=2)
Finally, we can evaluate the model.
model.evaluate(test_dataset)
Evaluation of classification models
- Accuracy
- The simplest metric. It's the percentage of predictions that are correct (matching the ground truth).
Example: If a model correctly identifies 90 out of 100 images as cat or dog, the accuracy is 90%.
- Confusion Matrix
- A table that shows the counts of:
- True Positives (TP): Model correctly predicted positive.
- True Negatives (TN): Model correctly predicted negative.
- False Positives (FP): Model incorrectly predicted positive (a "false alarm").
- False Negatives (FN): Model incorrectly predicted negative (a "miss").
- Threshold
- YDF classification models predict a probability for each class. A threshold determines the cutoff for classifying something as positive or negative.
Example: If the threshold is 0.5, any prediction above 0.5 might be classified as "spam," and anything below as "not spam."
- ROC Curve (Receiver Operating Characteristic Curve)
- A graph that plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at various thresholds.
- TPR (Sensitivity or Recall): TP / (TP + FN) - How many of the actual positives did the model catch?
- FPR: FP / (FP + TN) - How many negatives were incorrectly classified as positives?
Interpretation: A good model has an ROC curve that hugs the top-left corner (high TPR, low FPR).
- AUC (Area Under the ROC Curve)
- A single number that summarizes the overall performance shown by the ROC curve. The AUC is often a more stable metric than accuracy because it does not depend on a single threshold. For multi-class classification models, each class is evaluated against all other classes.
Interpretation: Ranges from 0 to 1. A perfect model has an AUC of 1, while a random model has an AUC of 0.5. Higher is better.
- Precision-Recall Curve
- A graph that plots Precision against Recall at various thresholds.
- Precision: TP / (TP + FP) - Out of all the predictions the model labeled as positive, how many were actually positive?
- Recall (same as TPR): TP / (TP + FN) - Out of all the actual positive cases, how many did the model correctly identify?
Interpretation: A good model has a curve that stays high (both high precision and high recall). It is especially useful when dealing with imbalanced datasets (e.g., when one class is much rarer than the other).
- PR-AUC (Area Under the Precision-Recall Curve)
- Similar to AUC, but for the Precision-Recall curve: a single number summarizing performance. For multi-class classification models, each class is evaluated against all other classes. Higher is better.
- Threshold / Accuracy Curve
- A graph that shows how the model's accuracy changes as you vary the classification threshold.
- Threshold / Volume Curve
- A graph showing how the number of data points classified as positive changes as you vary the threshold.
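The definitions above can be checked on a toy example. The following sketch, in plain Python, applies a threshold to predicted probabilities, counts TP, TN, FP and FN, and derives accuracy, TPR (recall), FPR and precision. The function `binary_metrics`, the scores and the labels are made up for illustration; this is not how YDF computes its evaluation report internally.

```python
def binary_metrics(scores, labels, threshold=0.5):
    """Compute confusion counts and derived metrics for a binary problem.

    scores: predicted probabilities of the positive class.
    labels: ground truth, 1 for positive and 0 for negative.
    """
    tp = tn = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score > threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
        else:
            tn += 1
    total = tp + tn + fp + fn
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": (tp + tn) / total,
        "tpr": tp / (tp + fn) if tp + fn else 0.0,        # recall / sensitivity
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Hypothetical predictions and labels.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

metrics = binary_metrics(scores, labels, threshold=0.5)
lower_threshold_metrics = binary_metrics(scores, labels, threshold=0.35)
print(metrics)
print(lower_threshold_metrics)
```

Lowering the threshold from 0.5 to 0.35 catches one more positive in this toy data, so the TPR goes up while the FPR is unchanged. Sweeping the threshold over all values and plotting the resulting (FPR, TPR) or (recall, precision) pairs gives the ROC and Precision-Recall curves described above.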
Label \ Pred | Sci/Tech | World | Business | Sports |
---|---|---|---|---|
Sci/Tech | 1652 | 177 | 450 | 136 |
World | 84 | 1585 | 101 | 90 |
Business | 124 | 67 | 1310 | 30 |
Sports | 40 | 71 | 39 | 1644 |
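As a sanity check, the accuracy can be recomputed from the confusion matrix above: it is the sum of the diagonal (correct predictions) divided by the total number of examples. The sketch below copies the numbers from the table by hand; the variable names are illustrative. The result does not depend on which axis holds the labels and which holds the predictions.

```python
classes = ["Sci/Tech", "World", "Business", "Sports"]

# Values copied from the confusion matrix above, row by row.
matrix = [
    [1652, 177, 450, 136],
    [84, 1585, 101, 90],
    [124, 67, 1310, 30],
    [40, 71, 39, 1644],
]

# Correct predictions are on the diagonal.
correct = sum(matrix[i][i] for i in range(len(classes)))
total = sum(sum(row) for row in matrix)
accuracy = correct / total

print(f"correct={correct} total={total} accuracy={accuracy:.4f}")
```

With these numbers, 6191 of the 7600 test examples are classified correctly, which corresponds to an accuracy of about 81.5%.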