id: 100751fbe4f64209ac12da5e3ed0a2df parent_id: aff9e178d61e47b5b6ec9e3a66c0277e item_type: 1 item_id: c66c0433c6a14de9b9a40e5a7316e382 item_updated_time: 1697027638466 title_diff: "[]" body_diff: "[{\"diffs\":[[0,\"in the goal.\"],[1,\" Features are made up of the sets of symbols that appear in the statements, filtered and augmented in various ways.\\\n\\\nFeatures additionally have weights; these estimate the importance of the feature. These weights are determined on the basis of the TF-IDF (Term Frequency - Inverse Document Frequency) metric, which gives higher weight to more rare features (these are more likely to be specifically interesting).\\\n\\\n\\\"Like in usual premise selection, the dependencies of theorems will constitute the labels for the learning algorithms. [These are] the constants occurring in the type of T or the proof term of T\\\".\\\n\\\nTwo classifiers are used for premise selection: a k-Nearest Neighbor and a Naive Bayesian classifier.\\\n\\\n**k-Nearest Neighbors**\\\n\\\nThe similarity between two statements `a, b` is given by:\\\n\\\n`s(a, b) = sum(f in F(a) ∩ F(b), w(f)^τ₁)`\\\n\\\nwhere `F(x)` are the features of statement `x`, and `τ₁` is a scaling factor that gives more similar statements an additional advantage.\\\n\"],[0,\"\\\n\"]],\"start1\":22052,\"start2\":22052,\"length1\":13,\"length2\":993}]" metadata_diff: {"new":{},"deleted":[]} encryption_cipher_text: encryption_applied: 0 updated_time: 2023-10-11T12:34:54.706Z created_time: 2023-10-11T12:34:54.706Z type_: 13
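
The TF-IDF weighting and k-NN similarity described in the note body can be sketched in a few lines. This is a minimal illustration, not the actual implementation: the function names, the toy feature sets, and the default value of τ₁ are all assumptions made here for demonstration.

```python
import math

def idf_weights(statements):
    """IDF-style weights: w(f) = log(N / df(f)), so rarer features score higher.

    `statements` maps a statement name to its set of features (symbols).
    """
    n = len(statements)
    doc_freq = {}
    for feats in statements.values():
        for f in set(feats):
            doc_freq[f] = doc_freq.get(f, 0) + 1
    return {f: math.log(n / df) for f, df in doc_freq.items()}

def similarity(feats_a, feats_b, weights, tau1=6.0):
    """s(a, b) = sum over shared features f of w(f)**tau1.

    tau1 > 1 boosts the contribution of rare (high-weight) shared features;
    the default value here is an illustrative assumption.
    """
    shared = set(feats_a) & set(feats_b)
    return sum(weights[f] ** tau1 for f in shared)
```

With this weighting, two statements that share a rare symbol come out more similar than two that share only a ubiquitous one, which is the behavior the TF-IDF metric is meant to produce.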