Tíquete #45767

"Double free or corruption" on assign_continent_flood()

: 2022-10-05 05:04 Última Atualização: 2022-10-20 06:24

Relator:
Dono:
Tipo:
Estado:
Fechado
Componente:
Marcos:
Prioridade:
5 - Medium
Gravidade:
5 - Medium
Resolução:
Fixed
Arquivo:
1

Details

Got this in a S3_1 autogame. After several reproducing attempts got it again (maybe one has to configure with '--enable-testmatic', as that was the latest change in my reproducing attempts before it succeeded)

glibc reports "double free or corruption" from tile_list_remove() called when continent numbers are reassigned from check_terrain_change()

Ticket History (3/10 Histories)

2022-10-05 05:04 Updated by: cazfi
  • New Ticket ""Double free or corruption" on assign_continent_flood()" created
2022-10-05 05:12 Updated by: cazfi
Comentário

Well addign_continent_flood() removes tile from a list it's currently iterating, and that's not _safe iterator (I don't think one exist for this). That's the case on all branches, and is likely very old bug - somehow it just has never caused failures before this particular autogame. With this information it's hard to estimate if this is critical issue wrt 3.0.4 release. Seems not to be a regression, but it's possible that I've just encountered the first crash that is coming to be a trend e.g. because some new dependency library version.

2022-10-05 05:18 Updated by: cazfi
  • Marco Update from (Nenhum) to 3.0.4 (fechado)
  • Componente Update from (Nenhum) to Server
Comentário

Reply To cazfi

it's possible that I've just encountered the first crash that is coming to be a trend e.g. because some new dependency library version.

Well, the very glibc itself had been updated just before that autogame run. So it's likely that this crash will be common as the new glibc version gets widely used -> 3.0.4 should have a fix.

2022-10-05 05:54 Updated by: cazfi
Comentário

Reply To cazfi

Well assign_continent_flood() removes tile from a list it's currently iterating, and that's not _safe iterator (I don't think one exist for this).

There's something more to this - I reworked that part, but the crash remains effectively the same (still no way to reproduce consistently, but now with a high percentage of the runs of that autogame)

2022-10-05 07:09 Updated by: cazfi
Comentário

From the save autogame (and likely from the very situation) valgrind reports invalid read on dai_data_phase_begin() handling of workers/continent statistics, in call chain beginning from the tile change. Likely this AI stats update is happening when the terrain class has already changed, but continent numbers have not been updated to reflect that.

2022-10-05 07:28 Updated by: cazfi
Comentário

Seems that to do all this properly requires quite big changes (and most importantly a lot of testing, which is going to take time.) So, opened an emergency fix ticket #45768 for a temporary solution to enable 3.0.4 release.

2022-10-11 09:45 Updated by: cazfi
Comentário

Reply To cazfi

Well addign_continent_flood() removes tile from a list it's currently iterating, and that's not _safe iterator (I don't think one exist for this). That's the case on all branches, and is likely very old bug - somehow it just has never caused failures before this particular autogame. With this information it's hard to estimate if this is critical issue wrt 3.0.4 release. Seems not to be a regression, but it's possible that I've just encountered the first crash that is coming to be a trend e.g. because some new dependency library version.

As this was not the cause of this particular ticket, filed a new ticket about it -> #45825

(Edited, 2022-10-11 09:45 Updated by: cazfi)
2022-10-12 02:12 Updated by: cazfi
  • Dono Update from (Nenhum) to cazfi
  • Resolução Update from Nenhum to Accepted
Comentário

Attached patch is a bit simpler than what I had in my mind earlier, but this should be fine for the current codebase. Let's move to the over-engineered version only once there's need for it.

2022-10-20 06:24 Updated by: cazfi
  • Estado Update from Aberto to Fechado
  • Resolução Update from Accepted to Fixed

Editar

Please login to add comment to this ticket » Login