
Sapere Aude

Gregor J. Rothfuss

Fine-Tuning GPT-2

We’ve demonstrated reward learning from human preferences on two kinds of natural language tasks: stylistic continuation and summarization. Our results are mixed. For continuation we achieve good results with very few samples, but our summarization models are only “smart copiers”: they copy from the input text while skipping over irrelevant preamble. The advantage of smart copying is truthfulness, whereas the zero-shot and supervised models produce natural, plausible-looking summaries that are often lies. We believe the limiting factor in our experiments is data quality, exacerbated by the online data collection setting, and we plan to use batched data collection in the future.
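To make "reward learning from human preferences" concrete, here is a minimal sketch of fitting a reward model to labeled comparisons. It is not the method from the quoted work: it assumes a simple pairwise (Bradley-Terry style) logistic loss, a linear reward over two hypothetical hand-made features, and a handful of invented toy comparisons. A real setup would score text with a fine-tuned language model instead.

```python
import math
import random


def reward(weights, features):
    """Linear reward: dot product of weights and a sample's feature vector."""
    return sum(w * x for w, x in zip(weights, features))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def preference_loss(weights, preferred, rejected):
    """Pairwise loss -log(sigmoid(r(preferred) - r(rejected))), computed stably."""
    margin = reward(weights, preferred) - reward(weights, rejected)
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def train_reward_model(comparisons, dim, lr=0.1, epochs=200, seed=0):
    """Fit weights by stochastic gradient descent on the pairwise loss."""
    rng = random.Random(seed)
    weights = [0.0] * dim
    data = list(comparisons)
    for _ in range(epochs):
        rng.shuffle(data)
        for preferred, rejected in data:
            margin = reward(weights, preferred) - reward(weights, rejected)
            # Gradient of the loss w.r.t. weights is
            # -(1 - sigmoid(margin)) * (preferred - rejected); step against it.
            scale = 1.0 - sigmoid(margin)
            for i in range(dim):
                weights[i] += lr * scale * (preferred[i] - rejected[i])
    return weights


# Hypothetical features per sample: [copies_from_source, includes_preamble].
# Each tuple is (sample the labeler preferred, sample the labeler rejected).
# These values are invented for illustration only.
comparisons = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([1.0, 0.0], [1.0, 1.0]),
    ([0.5, 0.0], [0.5, 1.0]),
    ([1.0, 1.0], [0.0, 1.0]),
]

weights = train_reward_model(comparisons, dim=2)
mean_loss = sum(preference_loss(weights, p, r) for p, r in comparisons) / len(comparisons)
print("learned weights:", weights)
print("mean preference loss:", mean_loss)
```

On this toy data the learned reward favors copying from the source and penalizes preamble, which mirrors the "smart copier" behavior described above. In a full pipeline the learned reward would then serve as the training signal for fine-tuning the language model's policy, a step this sketch leaves out.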


Gregor J. Rothfuss
2019-09-20

ai
science

