LorenaYannnnn/bold_formatting-Qwen3-0.6B-OURS_self-seed_0 Text Generation • 0.6B • Updated 14 days ago • 1.29k
LorenaYannnnn/bold_formatting-Qwen3-0.6B-baseline_all_tokens-seed_2 Text Generation • 0.6B • Updated 15 days ago • 238
LorenaYannnnn/bold_formatting-Qwen3-0.6B-baseline_all_tokens-seed_1 Text Generation • 0.6B • Updated 15 days ago • 234
LorenaYannnnn/bold_formatting-Qwen3-0.6B-OURS_self-seed_1 Text Generation • 0.6B • Updated 15 days ago • 188
LorenaYannnnn/bold_formatting-Qwen3-0.6B-OURS_self-seed_2 Text Generation • 0.6B • Updated 15 days ago • 186
LorenaYannnnn/bold_formatting-Qwen3-0.6B-baseline_all_tokens-seed_0 Text Generation • 0.6B • Updated 15 days ago • 1.03k
LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens_w_kl-seed_2 Updated 18 days ago • 55
LorenaYannnnn/general_reward-Qwen3-0.6B_7168-baseline_all_tokens-seed_0 Text Generation • 0.6B • Updated 18 days ago • 342
LorenaYannnnn/general_reward-Qwen3-0.6B_7168-OURS_self-seed_0 Text Generation • 0.6B • Updated 18 days ago • 346
LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens_w_kl-seed_0 Updated 22 days ago • 66
LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens_w_kl-seed_1 Updated 22 days ago • 61
LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens-seed_1 Updated 24 days ago • 76
LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens-seed_2 Updated 28 days ago • 49
LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens-seed_0 Updated 29 days ago • 43
LorenaYannnnn/longer_response-Qwen3-0.6B-OURS_self-seed_2 Text Generation • 0.6B • Updated Mar 25 • 93
LorenaYannnnn/longer_response-Qwen3-0.6B-OURS_self-seed_1 Text Generation • 0.6B • Updated Mar 25 • 95