| 2025-05-27 21:22:16,904 INFO MainThread:2783009 [wandb_setup.py:_flush():70] Current SDK version is 0.19.11 | |
| 2025-05-27 21:22:16,905 INFO MainThread:2783009 [wandb_setup.py:_flush():70] Configure stats pid to 2783009 | |
| 2025-05-27 21:22:16,905 INFO MainThread:2783009 [wandb_setup.py:_flush():70] Loading settings from /home/hansirui_1st/.config/wandb/settings | |
| 2025-05-27 21:22:16,905 INFO MainThread:2783009 [wandb_setup.py:_flush():70] Loading settings from /home/hansirui_1st/jiayi/resist/setting3/scripts/wandb/settings | |
| 2025-05-27 21:22:16,905 INFO MainThread:2783009 [wandb_setup.py:_flush():70] Loading settings from environment variables | |
| 2025-05-27 21:22:16,905 INFO MainThread:2783009 [wandb_init.py:setup_run_log_directory():724] Logging user logs to /aifs4su/hansirui_1st/jiayi/setting3-imdb/tinyllama-3T/tinyllama-3T-s3-Q1-2000-Q2-1000/wandb/run-20250527_212216-prr8gv5v/logs/debug.log | |
| 2025-05-27 21:22:16,905 INFO MainThread:2783009 [wandb_init.py:setup_run_log_directory():725] Logging internal logs to /aifs4su/hansirui_1st/jiayi/setting3-imdb/tinyllama-3T/tinyllama-3T-s3-Q1-2000-Q2-1000/wandb/run-20250527_212216-prr8gv5v/logs/debug-internal.log | |
| 2025-05-27 21:22:16,905 INFO MainThread:2783009 [wandb_init.py:init():852] calling init triggers | |
| 2025-05-27 21:22:16,905 INFO MainThread:2783009 [wandb_init.py:init():857] wandb.init called with sweep_config: {} | |
| config: {'model_name_or_path': '/aifs4su/hansirui_1st/jiayi/setting3-imdb/tinyllama-3T/tinyllama-3T-s3-Q1-2000', 'max_length': 512, 'trust_remote_code': True, 'train_datasets': [('inverse-json', {'proportion': 1.0, 'path': '/home/hansirui_1st/jiayi/resist/imdb_data/train/neg/1000/train.json'})], 'eval_datasets': None, 'epochs': 1, 'per_device_train_batch_size': 1, 'per_device_eval_batch_size': 4, 'gradient_accumulation_steps': 8, 'gradient_checkpointing': True, 'lr': 1e-05, 'lr_scheduler_type': <SchedulerType.CONSTANT: 'constant'>, 'lr_warmup_ratio': 0.0, 'weight_decay': 0.0, 'seed': 42, 'fp16': False, 'bf16': True, 'tf32': True, 'eval_strategy': 'epoch', 'eval_interval': 1000000, 'need_eval': False, 'eval_split_ratio': None, 'output_dir': '/aifs4su/hansirui_1st/jiayi/setting3-imdb/tinyllama-3T/tinyllama-3T-s3-Q1-2000-Q2-1000', 'log_type': 'wandb', 'log_dir': '/aifs4su/hansirui_1st/jiayi/setting3-imdb/tinyllama-3T/tinyllama-3T-s3-Q1-2000-Q2-1000', 'log_project': 'Inverse_Alignment_IMDb', 'log_run_name': 'imdb-tinyllama-3T-s3-Q1-2000-Q2-1000', 'save_16bit': True, 'save_interval': 1000000, 'local_rank': 0, 'zero_stage': 3, 'offload': 'none', 'deepspeed': False, 'deepspeed_config': None, 'deepscale': False, 'deepscale_config': None, 'global_rank': 0, 'device': device(type='cuda', index=0), 'num_update_steps_per_epoch': 16, 'total_training_steps': 16, '_wandb': {}} | |
| 2025-05-27 21:22:16,905 INFO MainThread:2783009 [wandb_init.py:init():893] starting backend | |
| 2025-05-27 21:22:16,905 INFO MainThread:2783009 [wandb_init.py:init():897] sending inform_init request | |
| 2025-05-27 21:22:16,909 INFO MainThread:2783009 [backend.py:_multiprocessing_setup():101] multiprocessing start_methods=fork,spawn,forkserver, using: spawn | |
| 2025-05-27 21:22:16,909 INFO MainThread:2783009 [wandb_init.py:init():907] backend started and connected | |
| 2025-05-27 21:22:16,911 INFO MainThread:2783009 [wandb_init.py:init():1005] updated telemetry | |
| 2025-05-27 21:22:16,911 INFO MainThread:2783009 [wandb_init.py:init():1029] communicating run to backend with 90.0 second timeout | |
| 2025-05-27 21:22:17,601 INFO MainThread:2783009 [wandb_init.py:init():1104] starting run threads in backend | |
| 2025-05-27 21:22:17,803 INFO MainThread:2783009 [wandb_run.py:_console_start():2573] atexit reg | |
| 2025-05-27 21:22:17,804 INFO MainThread:2783009 [wandb_run.py:_redirect():2421] redirect: wrap_raw | |
| 2025-05-27 21:22:17,804 INFO MainThread:2783009 [wandb_run.py:_redirect():2490] Wrapping output streams. | |
| 2025-05-27 21:22:17,804 INFO MainThread:2783009 [wandb_run.py:_redirect():2513] Redirects installed. | |
| 2025-05-27 21:22:17,806 INFO MainThread:2783009 [wandb_init.py:init():1150] run started, returning control to user process | |
| 2025-05-27 21:25:32,289 INFO MainThread:2783009 [wandb_run.py:_finish():2321] finishing run xtom/Inverse_Alignment_IMDb/prr8gv5v | |
| 2025-05-27 21:25:32,290 INFO MainThread:2783009 [wandb_run.py:_atexit_cleanup():2538] got exitcode: 0 | |
| 2025-05-27 21:25:32,290 INFO MainThread:2783009 [wandb_run.py:_restore():2520] restore | |
| 2025-05-27 21:25:32,291 INFO MainThread:2783009 [wandb_run.py:_restore():2526] restore done | |
| 2025-05-27 21:25:33,292 INFO MainThread:2783009 [wandb_run.py:_restore():2520] restore | |
| 2025-05-27 21:25:33,292 INFO MainThread:2783009 [wandb_run.py:_restore():2526] restore done | |
| 2025-05-27 21:25:33,292 ERROR MainThread:2783009 [wandb_run.py:_atexit_cleanup():2559] Problem finishing run | |
| Traceback (most recent call last): | |
| File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2550, in _atexit_cleanup | |
| self._on_finish() | |
| File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2806, in _on_finish | |
| wait_with_progress( | |
| File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/mailbox/wait_with_progress.py", line 24, in wait_with_progress | |
| return wait_all_with_progress( | |
| ^^^^^^^^^^^^^^^^^^^^^^^ | |
| File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/mailbox/wait_with_progress.py", line 87, in wait_all_with_progress | |
| return asyncio_compat.run(progress_loop_with_timeout) | |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
| File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/site-packages/wandb/sdk/lib/asyncio_compat.py", line 27, in run | |
| future = executor.submit(runner.run, fn) | |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
| File "/aifs4su/hansirui_1st/miniconda3/envs/jy-resist/lib/python3.11/concurrent/futures/thread.py", line 169, in submit | |
| raise RuntimeError('cannot schedule new futures after ' | |
| RuntimeError: cannot schedule new futures after interpreter shutdown | |
| 2025-05-27 21:25:33,812 INFO MsgRouterThr:2783009 [mailbox.py:close():129] [no run ID] Closing mailbox, abandoning 2 handles. | |